Simulating Atmospheric Processes with Stochastic AI: A New Step Toward Smarter Climate Models

26 May 2024

Understanding the complex interplay of atmospheric processes is one of the biggest challenges in climate science. A new study led by EERIE project researcher Gunnar Behrens makes promising progress on this front by using machine learning (ML) to represent uncertainty in atmospheric processes, a critical yet often overlooked element of climate modeling.

The study, “Simulating Atmospheric Processes in Earth System Models and Quantifying Uncertainties With Deep Learning Multi‐Member and Stochastic Parameterizations” (Behrens et al., 2025), introduces novel ML approaches to better represent processes like convection and turbulence, which occur at scales too small to be resolved by conventional climate models.

Why Stochasticity Matters in Climate Models

Most current ML-driven climate model components provide deterministic predictions — that is, a single output for a given input. However, many atmospheric processes are inherently stochastic, meaning they can behave differently even under similar conditions. Ignoring this randomness limits the realism and accuracy of simulations.

To address this, the authors developed and tested stochastic ML-based parameterizations, allowing the model to generate a range of plausible outcomes rather than a single prediction. This ensemble approach improves both the representation of physical processes and the quantification of uncertainty — crucial for robust climate projections.

Three Paths to Smarter ML Parameterizations

The team explored three different ways to introduce stochasticity (see Fig. 1):

  1. Monte Carlo Dropout – Keeping dropout active during inference, so that randomly deactivated neurons yield a slightly different prediction on each forward pass.
  2. Multi-member Ensembles – Using multiple independently trained neural networks to capture a range of predictions.
  3. Latent Space Perturbations – Leveraging a variational encoder-decoder to introduce controlled randomness in the model’s internal representations.
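
The three strategies listed above can be illustrated with a minimal sketch in plain Python. This is not the architecture used in the study: the tiny one-hidden-layer network, its random weights (standing in for trained ones), the input vector, the dropout rate, the noise level and the number of members are all hypothetical choices made for illustration. In particular, the third strategy is approximated here by adding Gaussian noise to the hidden layer, a simplified stand-in for perturbing the latent space of a variational encoder-decoder.

```python
import math
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is reproducible


def make_weights(n_in, n_hidden, seed):
    """Random weights standing in for a trained network (illustrative only)."""
    r = random.Random(seed)
    w1 = [[r.gauss(0, 1) for _ in range(n_in)] for _ in range(n_hidden)]
    w2 = [r.gauss(0, 1) for _ in range(n_hidden)]
    return w1, w2


def forward(x, weights, drop_p=0.0, latent_sigma=0.0):
    """One forward pass of a one-hidden-layer network with a scalar output.

    drop_p > 0 keeps dropout active at inference (Monte Carlo dropout).
    latent_sigma > 0 adds Gaussian noise to the hidden representation
    (simplified stand-in for a latent space perturbation).
    """
    w1, w2 = weights
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in w1]
    if drop_p > 0.0:
        # Inverted dropout: zero units with probability drop_p, rescale the rest.
        hidden = [
            0.0 if rng.random() < drop_p else h / (1.0 - drop_p) for h in hidden
        ]
    if latent_sigma > 0.0:
        hidden = [h + rng.gauss(0.0, latent_sigma) for h in hidden]
    return sum(w * h for w, h in zip(w2, hidden))


x = [0.3, -1.2, 0.8]  # hypothetical scaled coarse-grid input

# (1) Monte Carlo dropout: one network, many stochastic forward passes.
single = make_weights(3, 16, seed=1)
mc_dropout = [forward(x, single, drop_p=0.1) for _ in range(50)]

# (2) Multi-member ensemble: several independently initialised networks.
members = [make_weights(3, 16, seed=s) for s in range(7)]
ensemble = [forward(x, m) for m in members]

# (3) Latent space perturbation: noise on the internal representation.
latent = [forward(x, single, latent_sigma=0.2) for _ in range(50)]

for name, preds in [("MC dropout", mc_dropout), ("ensemble", ensemble), ("latent", latent)]:
    print(f"{name:>10}: mean={statistics.mean(preds):+.3f} "
          f"spread={statistics.stdev(preds):.3f}")
```

In all three cases the same input produces a set of predictions rather than a single value, and the spread of that set serves as the uncertainty estimate.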

Figure 1. Overview of three stochastic AI strategies to emulate subgrid atmospheric processes: (1) Monte Carlo dropout on a single neural network, (2) ensemble of multiple deterministic networks, and (3) latent space perturbation in a variational encoder-decoder. Licensed under CC BY 4.0.

Among these, the multi-member ensemble approach showed particularly promising results in capturing the complex behavior of atmospheric convection and improved the simulation of tropical extreme precipitation.

Testing in the Real World (or as Close as it Gets)

The researchers embedded these new parameterizations in a state-of-the-art climate model using a so-called “superparameterization” framework. While full model integration remains challenging — with some hybrid simulations crashing early — a partial coupling strategy enabled stable runs over several months. Importantly, this still allowed for significant improvements in specific areas, such as the global precipitation diurnal cycle (Fig. 2), compared to traditional schemes.

Figure 2. Global precipitation diurnal cycle across models and observations. This figure shows the local solar time of the daily peak in precipitation from February to May 2013, highlighting regions with a pronounced diurnal cycle. It compares output from several climate model configurations: (a) a superparameterized version of CESM (SP-CESM), (b) the novel deterministic ensemble scheme (DNN-SP-CESM), (c) the new stochastic DNN-ensemble parameterization (DNN-ens-SP-CESM), and (d) the traditional Zhang-McFarlane scheme (ZM-CESM). For reference, observational data from the GPM IMERG satellite product are shown after (e) second-order and (f) first-order conservative remapping. White areas indicate regions where the diurnal cycle was too weak to be included. Licensed under CC BY 4.0, Behrens et al., 2025, their figure 8.
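
To make the quantity in Fig. 2 concrete, the following sketch shows one simple way to obtain the local solar time of the daily precipitation peak at a single grid point. It assumes hourly precipitation in UTC, averages it by hour of day and takes the hour with the largest mean; the function names and the synthetic data are hypothetical, and the study may well use a different method (for example a harmonic fit) and additional criteria to mask weak diurnal cycles.

```python
def mean_diurnal_cycle(series):
    """Average a multi-day hourly series (starting at 00 UTC) by hour of day."""
    n_days = len(series) // 24
    return [sum(series[d * 24 + h] for d in range(n_days)) / n_days for h in range(24)]


def peak_local_solar_time(hourly_precip_utc, lon_deg):
    """Local solar time (in hours) of the peak of a mean diurnal cycle.

    hourly_precip_utc: 24 mean precipitation values, index = UTC hour.
    lon_deg: longitude in degrees east (negative for west).
    """
    if len(hourly_precip_utc) != 24:
        raise ValueError("expected 24 hourly values")
    peak_utc = max(range(24), key=lambda h: hourly_precip_utc[h])
    # 15 degrees of longitude correspond to one hour of solar time.
    return (peak_utc + lon_deg / 15.0) % 24.0


# Synthetic example: three days of hourly data with a peak at 20 UTC,
# at a hypothetical grid point located at 60 degrees west.
series = [5.0 if h % 24 == 20 else 1.0 for h in range(3 * 24)]
print(peak_local_solar_time(mean_diurnal_cycle(series), lon_deg=-60.0))  # 16.0
```

A peak at 20 UTC at 60 degrees west thus corresponds to a late-afternoon maximum at 16:00 local solar time.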

The team also analysed the uncertainty of their ML-based ensemble parameterizations, measured as the spread of the ensemble members, when partially coupled to the climate model. They found that the novel parameterization showed higher uncertainty near mountain ranges, over strong surface ocean currents and, owing to deep convective processes, along the intertropical convergence zone (Fig. 3). This result indicates that ML-based ensemble and stochastic parameterizations do not produce arbitrary uncertainties but instead learn realistic ones that reflect well-known drivers of subgrid atmospheric processes in the Earth system.

Figure 3. Mean interquartile range (the difference between the 75th and the 25th percentile of the members) of the novel deterministic ensemble scheme (DNN-SP-CESM) in February 2013 when partially coupled to the climate model. Panel a) shows the mean interquartile range of the specific humidity tendency and panel b) that of the temperature tendency in the upper planetary boundary layer at a reference pressure of 831 hPa. Panels c) and d) show the respective interquartile ranges of the snow and precipitation rates. Examples of higher uncertainties due to drivers of atmospheric processes are highlighted. Licensed under CC BY 4.0, Behrens et al., 2025, their figure S54 in the supporting information.
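
The interquartile range defined in the caption of Fig. 3 can be computed as in the short sketch below. It assumes that, for one grid point, the predictions of all ensemble members are available at each time step; the function names, the data layout `fields[t][m]` and the seven members in the example are hypothetical and not taken from the study.

```python
import statistics


def interquartile_range(member_values):
    """75th minus 25th percentile of one quantity across ensemble members."""
    q1, _, q3 = statistics.quantiles(member_values, n=4, method="inclusive")
    return q3 - q1


def mean_iqr(fields):
    """Time-mean IQR, where fields[t][m] is the value of member m at step t."""
    return statistics.mean([interquartile_range(step) for step in fields])


# Hypothetical predictions of seven members at two time steps:
# the members disagree at the first step and agree at the second.
fields = [
    [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0],
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
]
print(interquartile_range(fields[0]))  # 3.0
print(mean_iqr(fields))                # 1.5
```

Compared with the full range between the smallest and largest member, the interquartile range is less sensitive to a single outlying member, which makes it a robust measure of ensemble spread.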

Uncertainty Estimates for Next-Generation ESMs Using EERIE Simulations with ML

These novel ML-based stochastic and ensemble parameterizations provide reliable uncertainty estimates for subgrid atmospheric processes in climate models. The work shows ways forward for simulating subgrid atmospheric processes, together with their uncertainties, in coarse climate models while leveraging the presence of oceanic eddies in innovative EERIE simulations. The aim is to improve our understanding of air-sea interactions and to reduce long-standing biases in the next generation of Earth system models with ML.