
arxiv: 2606.29172 · v1 · pith:YRKDZ3ZVnew · submitted 2026-06-28 · ⚛️ physics.ao-ph

CORDEX-ML-Bench: A Benchmark for Data-Driven Regional Climate Downscaling - Experiment Design and Overview

Pith reviewed 2026-06-30 02:26 UTC · model grok-4.3

classification ⚛️ physics.ao-ph
keywords: regional climate downscaling, machine learning, generative models, precipitation extremes, climate benchmarks, perfect model experiment, CORDEX

The pith

A new benchmark finds generative models outperform deterministic ones for precipitation downscaling while historical-only training underestimates future climate signals.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper sets up CORDEX-ML-Bench as a standardized framework to compare machine learning methods for turning coarse climate data into 10 km regional projections of temperature and precipitation. It runs 40 independent model configurations through a perfect-model test across the European Alps, New Zealand, and Southern Africa. Generative approaches based on diffusion, flow matching, and GANs handle fine-scale precipitation variability and extremes more reliably than deterministic networks. Training on historical data alone produces systematic underestimation of future change signals, while adding future-period data improves fidelity.

Core claim

In the perfect-model experimental design, generative models consistently outperform deterministic approaches for precipitation by better capturing fine-scale variability and extremes; for temperature the advantage narrows and deterministic architectures remain competitive. Models trained solely on the historical period systematically underestimate future climate-change signals, while those trained on both historical and future periods perform better.
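
To make "climate-change signal" concrete: the Figure 7 caption defines ∆TXx as the difference in mean annual maximum daily maximum temperature between the mid-century (2041–2060) and historical (1981–2000) periods. The following is a minimal sketch of that definition and of the resulting signal bias, not the benchmark's own code. The function names and the toy data are hypothetical, and the real evaluation works on gridded fields rather than a single series.

```python
from statistics import mean

def annual_max(daily_by_year):
    """Map {year: [daily values]} to {year: annual maximum}."""
    return {year: max(values) for year, values in daily_by_year.items()}

def period_mean(annual, start, end):
    """Mean of annual values over the inclusive year range [start, end]."""
    return mean(v for y, v in annual.items() if start <= y <= end)

def delta_txx(daily_by_year, hist=(1981, 2000), fut=(2041, 2060)):
    """Change signal in K: future-period mean TXx minus historical mean TXx."""
    annual = annual_max(daily_by_year)
    return period_mean(annual, *fut) - period_mean(annual, *hist)

def toy(hist_peak, fut_peak):
    """Hypothetical data: three 'days' per year with a fixed peak per period."""
    data = {y: [hist_peak - 10.0, hist_peak, hist_peak - 5.0]
            for y in range(1981, 2001)}
    data.update({y: [fut_peak - 10.0, fut_peak, fut_peak - 5.0]
                 for y in range(2041, 2061)})
    return data

reference = toy(300.0, 303.0)  # pseudo-reality target
emulated = toy(300.0, 302.0)   # a model that warms too little

# A negative value means the model underestimates the reference signal.
signal_bias = delta_txx(emulated) - delta_txx(reference)
```

In this toy case the reference signal is 3 K and the emulated signal is 2 K, so the bias is -1 K, which is the kind of underestimation the core claim describes for historical-only training.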

What carries the argument

A perfect-model experimental design that contrasts two training setups: an empirical-statistical downscaling (ESD) pseudo-reality trained on the historical period only, and an Emulator setup that also includes future periods. All models are then scored against a core set of downscaling-specific metrics.
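
The Figure 2 caption states that, in the perfect framework, high-resolution RCM fields are coarsened to produce low-resolution predictor fields (X), which are paired with high-resolution target fields (Y). A minimal sketch of building one such pair follows. It assumes plain block averaging, since the text here does not say which coarsening operator the benchmark uses, and the function names are hypothetical. The default factor of 20 mirrors the "20x increase" quoted in the abstract.

```python
def coarsen(field, factor):
    """Block-average a 2-D field (list of equal-length rows) by an integer factor."""
    n_rows, n_cols = len(field), len(field[0])
    if n_rows % factor or n_cols % factor:
        raise ValueError("grid dimensions must be divisible by the factor")
    coarse = []
    for i in range(0, n_rows, factor):
        row = []
        for j in range(0, n_cols, factor):
            # Collect every fine-grid cell that falls inside this coarse cell.
            block = [field[a][b]
                     for a in range(i, i + factor)
                     for b in range(j, j + factor)]
            row.append(sum(block) / len(block))
        coarse.append(row)
    return coarse

def make_training_pair(high_res, factor=20):
    """Return (X, Y): the coarsened predictor and the untouched fine target."""
    return coarsen(high_res, factor), high_res

# Toy 4x4 field coarsened by a factor of 2 (assumption: illustrative values only).
y_field = [[1.0, 1.0, 2.0, 2.0],
           [1.0, 1.0, 2.0, 2.0],
           [3.0, 3.0, 4.0, 4.0],
           [3.0, 3.0, 4.0, 4.0]]
x_field, y_target = make_training_pair(y_field, factor=2)
```

Because X is derived from Y, predictor and target are consistent by construction, which is what makes the setup a controlled test of the downscaling step alone.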

If this is right

  • Generative models should be prioritized when the target variable is precipitation because they reduce errors in extremes.
  • Any operational downscaling system trained only on historical data risks understating future climate-change signals.
  • Including future-period data during training improves projection skill for both temperature and precipitation.
  • Standardized multi-model benchmarks are required before ML methods can be treated as operationally ready for CMIP7-era regional projections.
  • For temperature downscaling, simpler deterministic networks remain competitive, so architecture choice can be task-specific.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

  • Operational climate services that rely on historically trained ML models may need periodic retraining or explicit extrapolation tests to avoid missing intensified extremes.
  • The same benchmark structure could be reused to test transferability across additional CORDEX domains or to new variables such as wind speed.
  • Hybrid training that blends historical observations with a small amount of future pseudo-data might offer a practical middle path between the two experimental periods tested here.

Load-bearing premise

The perfect-model setup using an empirical-statistical downscaling pseudo-reality accurately represents the generalization challenge that real-world ML downscaling models will face when applied to future climate conditions outside the training distribution.

What would settle it

Direct comparison of historically trained ML outputs against either dynamical downscaling runs or actual future observations in the same three regions would show whether the reported underestimation of change signals occurs in practice.

Figures

Figures reproduced from arXiv: 2606.29172 by Andrew Orr, Antoine Doury, Ben Booth, Caroline Hardy, Erika Coppola, Francois Engelbrecht, Henry Addison, Hugo Kyo Lee, Jessica Steinkopf, Jorge Baño-Medina, José González-Abad, José M. Gutiérrez, Joshua Oldham-Dorrington, Jr-Ben Tian, Julius Polz, Ko-Chih Wang, Luca Glawion, Maria Laura Bettolli, Martin S. J. Rogers, Martin Widmann, Matias Olmo, Maybritt Schillinger, Mikel N. Legasa, Mikhail Ivanov, Neelesh Rampal, Pedro M. M. Soares, Peter A. G. Watson, Peter B. Gibson, Ramón Fuentes-Franco, Ricardo Tomé, Serafina Di Gioia, Shivani Sharma, Stefan Sobolowski, Tom Wetherell, Valentina Blasone, Wenchang Tang, Yi-Chi Wang.

Figure 1: CORDEX-ML-Bench Experimental Design. The top panel provides a schematic overview of the benchmark, summarizing the spatial domains, training period/experiment, machine-learning architectures, and evaluation periods. The bottom panel shows the regional domains in greater detail, including the coarse-resolution predictor domains (dashed black lines), the nested high-resolution target domains (solid red lines…
Figure 2: Schematic of the perfect (training phase, left) and imperfect (application phase, right) frameworks used in CORDEX-ML-Bench. In the perfect framework, high-resolution RCM fields are coarsened to produce low-resolution predictor fields (X), which are paired with high-resolution RCM target fields (Y) for training. RCM-derived target fields (Y) are used for evaluation in both the perfect and imperfect settin…
Figure 3: An extreme-event case study is included to illustrate predictions from different ML models in the RCM Emulator experiment across all three domains. The selected event (2000-08-26 for the European Alps, 1999-12-11 for South Africa, and 1990-02-13 for New Zealand) corresponds to the hottest day from the cross-validation historical period (1981–2010), based on area-averaged daily maximum temperature, represe…
Figure 4: Same as …
Figure 5: TXx climatology (mean annual maximum daily maximum temperature) over the cross-validation period (1981–2000) for the RCM Emulator experiment. Layout as in …
Figure 6: Same as …
Figure 7: Climate-change signal in TXx (∆TXx, K) for the models trained on the historical period only (ESD experiment), defined as the difference in mean annual maximum daily maximum temperature between the mid-century (2041–2060) and historical (1981–2000) periods. Each row corresponds to one domain (Alps, Southern Africa, New Zealand); the reference signal is shown in the large right-hand panel with the area-mean…
Figure 8: As …
Figure 9: Climate-change signal in Rx1day (∆Rx1day, %) for the ESD experiment, defined as the relative change in mean annual maximum one-day precipitation between mid-century (2041–2060) and the historical period (1981–2000). Models shown are RCMGEM-mv-orog, FlowMatching-v1, and ResGAN-v2 (top); CNRM-UNeT, ANN, and Rossby-UNet (bottom). The area-mean ∆ and RMSE of the change field are annotated per panel. The inset …
Figure 10: As …
Figure 11: Ranking scorecard for precipitation (pr) in the ESD (left) and RCM Emulator experiment (right). Within each scorecard, cell shading indicates relative rank for each metric (column). Each cell is split diagonally, with the shading of the upper-left triangle showing the rank during the historical cross-validation (1981–2010) period and the lower-right triangle showing the rank during the out-of-sample futu…
Figure 12: As …
Figure 13: Overall rank summary across all experiments and evaluation periods for both precipitation and temperature. Each panel shows the overall rank (aggregated across all metrics and domains) for each model in each of the four experiment–variable combinations: precipitation ESD, precipitation Emulator, temperature ESD, and temperature Emulator. Within each cell, the upper-left triangle shows the cross-validation…
Figure 14: Skill–compute relationship across CORDEX-ML-Bench models. The horizontal axis shows inference cost as wall-clock time per simulated year per ensemble member, expressed as a single A100-GPU equivalent. To enable hardware-agnostic comparison, recorded times are scaled by the approximate throughput of each GPU class relative to an A100 (H100/H200/GH200: ×2.0; A40: ×0.65; A30/A10: ×0.55; V100: ×0.50; RTX 60…
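
The hardware normalization described in the Figure 14 caption can be sketched as a lookup and a multiplication. This is an illustration under two assumptions: that "scaled by" means the recorded time is multiplied by the listed factor, and that the A100 itself has a factor of 1.0. Only the factors visible in the caption are included, because the caption is truncated after the V100 entry. The function name is hypothetical.

```python
# Approximate throughput relative to an A100, as listed in the Figure 14 caption.
# The caption is cut off after V100, so later GPU classes are deliberately omitted.
A100_RELATIVE_THROUGHPUT = {
    "A100": 1.0,  # assumption: the reference GPU maps to itself
    "H100": 2.0, "H200": 2.0, "GH200": 2.0,
    "A40": 0.65,
    "A30": 0.55, "A10": 0.55,
    "V100": 0.50,
}

def a100_equivalent_seconds(recorded_seconds, gpu):
    """Convert wall-clock time measured on `gpu` to an estimated single-A100 time.

    Assumption: a run on a GPU with twice the A100 throughput would take
    twice as long on an A100, so the recorded time is multiplied by the factor.
    """
    try:
        factor = A100_RELATIVE_THROUGHPUT[gpu]
    except KeyError:
        raise ValueError(f"no throughput factor known for {gpu!r}") from None
    return recorded_seconds * factor

# Hypothetical timings: 120 s per simulated year on two different GPUs.
on_h100 = a100_equivalent_seconds(120.0, "H100")
on_v100 = a100_equivalent_seconds(120.0, "V100")
```

Under these assumptions, the same 120 s reading counts as 240 s when it came from an H100 and 60 s when it came from a V100, which is why raw wall-clock times from different clusters are not directly comparable.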
Original abstract

Machine learning (ML) has emerged as a cost-effective approach to complement dynamical downscaling for producing high-resolution regional climate projections. However, the absence of standardised training and evaluation protocols, applied consistently across multiple domains, continues to hinder meaningful model intercomparison. We introduce CORDEX-ML-Bench, a benchmark aligned with CORDEX, which constitutes the first phase of a community initiative to advance data-driven downscaling toward operational readiness, and to complement future dynamical downscaling efforts under CMIP7. The framework targets downscaled daily maximum temperature and precipitation to ~10 km resolution (20x increase) across three pilot regions: European Alps, New Zealand, and Southern Africa. Using a perfect-model experimental design, we evaluate 40 ML configurations developed independently, spanning traditional ML, convolutional U-Nets, vision transformers, graph neural networks, and generative models based on diffusion, flow matching, and generative adversarial networks. Models are trained under two experimental periods, an empirical-statistical downscaling pseudo-reality (historical period only) and an Emulator (historical and future periods), and are evaluated against a core set of metrics developed specifically for assessing downscaling skill. Generative models consistently outperform deterministic approaches for precipitation, better capturing fine-scale variability and extremes. For temperature, the generative advantage narrows and deterministic architectures remain competitive. Models trained solely on the historical period systematically underestimate future climate-change signals, while those additionally trained on a future period perform better. These findings raise concerns about historically trained models widely used in an operational setting, underscoring the need for rigorous extrapolation testing.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript introduces CORDEX-ML-Bench, the first phase of a community benchmark for ML-based regional climate downscaling aligned with CORDEX protocols. It uses a perfect-model experimental design to downscale daily maximum temperature and precipitation to ~10 km resolution across the European Alps, New Zealand, and Southern Africa. Forty independently developed ML configurations (traditional ML, U-Nets, vision transformers, graph neural networks, and generative models based on diffusion, flow matching, and GANs) are evaluated under two training regimes: historical-only empirical-statistical downscaling pseudo-reality and an Emulator that includes future periods. Core findings are that generative models outperform deterministic approaches for precipitation (better capturing variability and extremes) while the advantage narrows for temperature, and that historical-only training systematically underestimates future climate-change signals.

Significance. If the benchmark protocols and findings hold under scrutiny, the work provides a much-needed standardized framework for intercomparing data-driven downscaling methods, directly addressing the lack of consistent training/evaluation protocols noted in the abstract. Explicit credit is due for the community-oriented design, the breadth of 40 model configurations spanning multiple architectures, the development of downscaling-specific metrics, and the focus on extrapolation testing between historical and future periods. These elements position the benchmark as a useful complement to dynamical downscaling efforts under CMIP7.

major comments (2)
  1. [Abstract and Methods] Abstract and Methods (perfect-model experimental design): The central claim that 'Models trained solely on the historical period systematically underestimate future climate-change signals' rests on the empirical-statistical downscaling pseudo-reality being a faithful proxy for real-world distribution shift. However, because the high-resolution target is generated from the same dynamical core and forcing as the low-resolution driver, non-stationarity is limited to internal variability and resolution-dependent processes within a single model. This design choice risks understating structural biases (e.g., altered convective schemes or aerosol feedbacks) present in actual GCM-to-observation or cross-GCM applications, directly affecting the operational-risk interpretation of the historical-only results.
  2. [Results] Results (generative vs. deterministic comparison): The finding that generative models 'consistently outperform deterministic approaches for precipitation' is load-bearing for the benchmark's value, yet the abstract supplies no quantitative details on the core metrics, exact model configurations, data exclusion rules, or statistical significance tests used to establish outperformance. Without these, it is impossible to determine whether the reported advantage is robust or sensitive to the chosen evaluation periods and regions.
minor comments (2)
  1. [Methods] The manuscript would benefit from an explicit table or section listing the precise definitions of the 'core set of metrics developed specifically for assessing downscaling skill' so that future users can replicate the evaluation protocol.
  2. [Experimental Design] Clarify in the experimental design whether the three pilot regions were chosen for diversity in orography/climate or simply for data availability, as this affects the generalizability claim.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on our manuscript introducing CORDEX-ML-Bench. The comments highlight important considerations for interpreting the perfect-model design and the strength of evidence for model comparisons. We address each point below and propose targeted revisions where appropriate.

Point-by-point responses
  1. Referee: [Abstract and Methods] Abstract and Methods (perfect-model experimental design): The central claim that 'Models trained solely on the historical period systematically underestimate future climate-change signals' rests on the empirical-statistical downscaling pseudo-reality being a faithful proxy for real-world distribution shift. However, because the high-resolution target is generated from the same dynamical core and forcing as the low-resolution driver, non-stationarity is limited to internal variability and resolution-dependent processes within a single model. This design choice risks understating structural biases (e.g., altered convective schemes or aerosol feedbacks) present in actual GCM-to-observation or cross-GCM applications, directly affecting the operational-risk interpretation of the historical-only results.

    Authors: We agree that the perfect-model setup inherently constrains non-stationarity to processes within a single dynamical core, and thus cannot fully capture structural biases arising from model differences or observational mismatches in operational settings. This was a deliberate choice to enable controlled isolation of resolution effects and extrapolation behavior across the 40 configurations. The observed underestimation of future signals in historical-only training remains a robust finding within this controlled framework and serves as a conservative indicator. We will revise the Methods and Discussion sections to explicitly qualify the claim, noting that real-world risks may be larger, and add a forward-looking statement on planned extensions to cross-model and observation-based protocols in future benchmark phases. revision: yes

  2. Referee: [Results] Results (generative vs. deterministic comparison): The finding that generative models 'consistently outperform deterministic approaches for precipitation' is load-bearing for the benchmark's value, yet the abstract supplies no quantitative details on the core metrics, exact model configurations, data exclusion rules, or statistical significance tests used to establish outperformance. Without these, it is impossible to determine whether the reported advantage is robust or sensitive to the chosen evaluation periods and regions.

    Authors: The abstract is intentionally concise, but the full manuscript provides the requested details: Section 3 defines the core metrics (including precipitation-specific ones for variability and extremes), Section 4 lists all 40 configurations with architecture and training hyperparameters, data exclusion follows standard CORDEX protocols with explicit hold-out periods, and statistical significance is assessed via bootstrapped confidence intervals and paired t-tests reported in the Results and supplementary material. The outperformance holds across the three regions and evaluation periods. To improve accessibility, we will add one sentence to the abstract summarizing the magnitude of improvement (e.g., relative reduction in extreme precipitation bias) while retaining brevity. revision: partial
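
The simulated rebuttal above mentions bootstrapped confidence intervals for the generative versus deterministic comparison without describing the procedure. One common way to do this is a paired percentile bootstrap on per-sample score differences, sketched below. This is an assumption about what such a test could look like, not the authors' method. The function name, the error values, and the choice of 2000 resamples are all hypothetical.

```python
import random
from statistics import mean

def paired_bootstrap_ci(scores_a, scores_b, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the mean paired difference (a - b)."""
    if len(scores_a) != len(scores_b):
        raise ValueError("paired samples must have equal length")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(diffs)
    # Resample the paired differences with replacement and record each mean.
    means = sorted(mean(rng.choices(diffs, k=n)) for _ in range(n_resamples))
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return mean(diffs), lower, upper

# Hypothetical per-year errors for two models on the same years (lower is better).
deterministic_err = [5.1, 4.8, 5.5, 5.0, 4.9, 5.3, 5.2, 4.7, 5.4, 5.0]
generative_err = [4.0, 4.1, 4.4, 3.9, 4.2, 4.3, 4.0, 3.8, 4.5, 4.1]

point, lower, upper = paired_bootstrap_ci(deterministic_err, generative_err)
```

If the whole interval lies above zero, as it does for these toy numbers, the deterministic model's error exceeds the generative model's across resamples. Pairing matters here because both models are scored on the same years, so year-to-year difficulty cancels out of the difference.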

Circularity Check

0 steps flagged

No circularity: empirical benchmark with no derivations or self-referential predictions

full rationale

The paper is an empirical benchmark study comparing 40 ML configurations for regional climate downscaling under a perfect-model pseudo-reality design. It reports direct performance metrics on temperature and precipitation without any claimed derivations, equations that reduce to inputs by construction, fitted parameters renamed as predictions, or load-bearing self-citations. The central findings (generative models outperforming on precipitation; historical-only training underestimating change signals) are statistical outcomes of the held-out evaluation, not tautological restatements of the experimental setup. This is the expected non-finding for a methods-and-results benchmark paper.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review supplies no information on free parameters, axioms, or invented entities.

pith-pipeline@v0.9.1-grok · 6005 in / 1147 out tokens · 56521 ms · 2026-06-30T02:26:30.460922+00:00 · methodology


Reference graph

Works this paper leans on

104 extracted references · 16 canonical work pages · 6 internal anchors

  1. [1]

    Annual review of environment and resources , volume=

    Regional dynamical downscaling and the CORDEX initiative , author=. Annual review of environment and resources , volume=. 2015 , publisher=

  2. [2]

    Journal of Geophysical Research: Atmospheres , volume=

    High-resolution climate change projections of atmospheric rivers over the South Pacific , author=. Journal of Geophysical Research: Atmospheres , volume=. 2025 , publisher=

  3. [3]

    World Meteorological Organization (WMO) Bulletin , volume=

    Addressing climate information needs at the regional level: the CORDEX framework , author=. World Meteorological Organization (WMO) Bulletin , volume=

  4. [4]

    Environmental Research: Climate , year =

    Lehner, Flavio and Deser, Clara , title =. Environmental Research: Climate , year =

  5. [5]

    Earth System Dynamics , volume=

    Large ensemble climate model simulations: introduction, overview, and future prospects for utilising multiple types of large ensemble , author=. Earth System Dynamics , volume=. 2021 , publisher=

  6. [6]

    Nature climate change , volume=

    Insights from Earth system model initial-condition large ensembles and future prospects , author=. Nature climate change , volume=. 2020 , publisher=

  7. [7]

    Climate Dynamics , volume=

    Local-scale changes in mean and heavy precipitation in Western Europe, climate change or internal variability? , author=. Climate Dynamics , volume=. 2018 , publisher=

  8. [8]

    Geophysical Research Letters , volume=

    On the extrapolation of generative adversarial networks for downscaling precipitation extremes in warmer climates , author=. Geophysical Research Letters , volume=. 2024 , publisher=

  9. [9]

    Journal of Advances in Modeling Earth Systems , volume=

    A reliable generative adversarial network approach for climate downscaling and weather generation , author=. Journal of Advances in Modeling Earth Systems , volume=. 2025 , publisher=

  10. [10]

    arXiv preprint arXiv:2507.06527 , year=

    Downscaling with AI reveals the large role of internal variability in fine-scale projections of climate extremes , author=. arXiv preprint arXiv:2507.06527 , year=

  11. [11]

    Journal of Geophysical Research: Machine Learning and Computation , volume=

    Pan-European high-resolution downscaling using deep learning , author=. Journal of Geophysical Research: Machine Learning and Computation , volume=. 2025 , publisher=

  12. [12]

    Proceedings of the European conference on computer vision (ECCV) , pages=

    Cbam: Convolutional block attention module , author=. Proceedings of the European conference on computer vision (ECCV) , pages=

  13. [13]

    arXiv preprint arXiv:2509.21844 , year=

    Generative ai-downscaling of large ensembles project unprecedented future droughts , author=. arXiv preprint arXiv:2509.21844 , year=

  14. [14]

    Climate change 2013: the physical science basis

    Evaluation of climate models , author=. Climate change 2013: the physical science basis. Contribution of Working Group I to the Fifth Assessment Report of the Intergovernmental Panel on Climate Change , pages=. 2014 , publisher=

  15. [15]

    Geophysical Research Letters , volume=

    Assessing biases and climate implications of the diurnal precipitation cycle in climate models , author=. Geophysical Research Letters , volume=. 2021 , publisher=

  16. [16]

    Journal of climate , volume=

    The diurnal cycle and its depiction in the Community Climate System Model , author=. Journal of climate , volume=

  17. [17]

    Climate Dynamics , volume=

    On the use of convolutional neural networks for downscaling daily temperatures over southern South America in a climate change scenario , author=. Climate Dynamics , volume=. 2024 , publisher=

  18. [18]

    Journal of Applied Statistics , volume=

    Emulation and interpretation of high-dimensional climate model outputs , author=. Journal of Applied Statistics , volume=. 2015 , publisher=

  19. [19]

    Artificial Intelligence for the Earth Systems , volume=

    Enhancing regional climate downscaling through advances in machine learning , author=. Artificial Intelligence for the Earth Systems , volume=. 2024 , publisher=

  20. [20]

    Reviews of geophysics , volume=

    Precipitation downscaling under climate change: Recent developments to bridge the gap between dynamical models and the end user , author=. Reviews of geophysics , volume=. 2010 , publisher=

  21. [21]

    and Engelbrecht, Francois and Steinkopf, Jessica and Hardy, Caroline , title =

    Rampal, Neelesh and Gonzalez-Abad, Jose and Gibson, Peter B. and Engelbrecht, Francois and Steinkopf, Jessica and Hardy, Caroline , title =. 2026 , publisher =

  22. [22]

    Geoscientific Model Development , volume=

    WCRP coordinated regional downscaling experiment (CORDEX): a diagnostic MIP for CMIP6 , author=. Geoscientific Model Development , volume=. 2016 , publisher=

  23. [23]

    Geoscientific Model Development , volume=

    Conditional diffusion models for downscaling and bias correction of Earth system model precipitation , author=. Geoscientific Model Development , volume=. 2026 , publisher=

  24. [24]

    Current climate change reports , volume=

    Precipitation extremes under climate change , author=. Current climate change reports , volume=. 2015 , publisher=

  25. [25]

    Nature Climate Change , volume=

    Understanding the regional pattern of projected future changes in extreme precipitation , author=. Nature Climate Change , volume=. 2017 , publisher=

  26. [26]

    Journal of Geophysical Research: Atmospheres , volume=

    Deep learning for downscaling tropical cyclone rainfall to hazard-relevant spatial scales , author=. Journal of Geophysical Research: Atmospheres , volume=. 2023 , publisher=

  27. [27]

    Nature , volume=

    Skilful precipitation nowcasting using deep generative models of radar , author=. Nature , volume=. 2021 , publisher=

  28. [28]

    Weather and Climate Extremes , volume=

    High-resolution downscaling with interpretable deep learning: Rainfall extremes over New Zealand , author=. Weather and Climate Extremes , volume=. 2022 , publisher=

  29. [29]

    ISPRS Journal of Photogrammetry and Remote Sensing , volume=

    Deep learning in statistical downscaling for deriving high spatial resolution gridded meteorological data: A systematic review , author=. ISPRS Journal of Photogrammetry and Remote Sensing , volume=. 2024 , publisher=

  30. [30]

    Bulletin of the American Meteorological Society , volume=

    Potential for machine learning emulators to augment regional climate simulations in provision of local climate change information , author=. Bulletin of the American Meteorological Society , volume=. 2025 , publisher=

  31. [31]

    Geoscientific Model Development , volume=

    Configuration and intercomparison of deep learning neural models for statistical downscaling , author=. Geoscientific Model Development , volume=. 2020 , publisher=

  32. [32]

    , title =

    Baño-Medina, Jorge and Manzanas, Rodrigo and Gutiérrez, José M. , title =. Climate Dynamics , year =

  33. [33]

    Artificial Intelligence for the Earth Systems , volume=

    Transferability and explainability of deep learning emulators for regional climate model projections: Perspectives for future applications , author=. Artificial Intelligence for the Earth Systems , volume=. 2024 , publisher=

  34. [34]

    Urban Climate , volume=

    Towards an improved representation of the urban heat island effect: A multi-scale application of XGBoost for madrid , author=. Urban Climate , volume=. 2024 , publisher=

  35. [35]

    Climate Dynamics , volume=

    Regional climate model emulator based on deep learning: Concept and first evaluation of a novel hybrid downscaling approach , author=. Climate Dynamics , volume=. 2023 , publisher=

  36. [36]

    Climate Dynamics , volume=

    On the suitability of a convolutional neural network based RCM-emulator for fine spatio-temporal precipitation , author=. Climate Dynamics , volume=. 2024 , publisher=

  37. [37]

    Artificial Intelligence for the Earth Systems , volume=

    Are deep learning methods suitable for downscaling global climate projections? An intercomparison for temperature and precipitation over Spain , author=. Artificial Intelligence for the Earth Systems , volume=. 2025 , publisher=

  38. [38]

    International Journal of Climatology , volume=

    Statistical downscaling of daily precipitation over southeastern South America: Assessing the performance in extreme events , author=. International Journal of Climatology , volume=. 2022 , publisher=

  39. [39]

    Environmental Data Science , volume=

    Environmental sensor placement with convolutional Gaussian neural processes , author=. Environmental Data Science , volume=. 2023 , publisher=

  40. [40]

    Scientific Reports , volume=

    Benchmarking the geographic generalization of deep learning models for precipitation downscaling , author=. Scientific Reports , volume=. 2026 , publisher=

  41. [41]

    Journal of Advances in Modeling Earth Systems , volume=

    WeatherBench: a benchmark data set for data-driven weather forecasting , author=. Journal of Advances in Modeling Earth Systems , volume=. 2020 , publisher=

  42. [42]

    International Conference on Medical image computing and computer-assisted intervention , pages=

    U-net: Convolutional networks for biomedical image segmentation , author=. International Conference on Medical image computing and computer-assisted intervention , pages=. 2015 , organization=

  43. [43]

    Journal of Advances in Modeling Earth Systems , volume=

    Weatherbench 2: A benchmark for the next generation of data-driven global weather models , author=. Journal of Advances in Modeling Earth Systems , volume=. 2024 , publisher=

  44. [44]

    Machine learning for climate downscaling in

    Guti. Machine learning for climate downscaling in. PLOS Climate , year =

  45. [45]

    AIMIP Phase 1: systematic evaluations of AI weather and climate models

    AIMIP Phase 1: systematic evaluations of AI weather and climate models , author=. arXiv preprint arXiv:2605.06944 , year=

  46. [46]

    A benchmark dataset for meteorological downscaling , author=. Proc. International Conference on Learning Representations , year=

  47. [47]

    arXiv e-prints , pages=

    Challenges of learning multi-scale dynamics with AI weather models: Implications for stability and one solution , author=. arXiv e-prints , pages=

  48. [48]

    arXiv preprint arXiv:2509.02061 , year=

    LUCIE-3D: A three-dimensional climate emulator for forced responses , author=. arXiv preprint arXiv:2509.02061 , year=

  49. [49]

    arXiv preprint arXiv:2602.18646 , year=

    AI-Based Regional Emulation for Kilometer-Scale Dynamical Downscaling , author=. arXiv preprint arXiv:2602.18646 , year=

  50. [50]

    Atmospheric Chemistry and Physics , volume=

    Modulation of radiative aerosols effects by atmospheric circulation over the Euro-Mediterranean region , author=. Atmospheric Chemistry and Physics , volume=. 2020 , publisher=

  51. [51]

    1 global climate model: description and basic evaluation , author=

    The CNRM-CM5. 1 global climate model: description and basic evaluation , author=. Climate dynamics , volume=. 2013 , publisher=

  52. [52]

    High resolution numerical modelling of the atmosphere and ocean , pages=

    An updated description of the conformal-cubic atmospheric model , author=. High resolution numerical modelling of the atmosphere and ocean , pages=. 2008 , publisher=

  53. [53]

    Monthly Weather Review , volume=

    Using a scale-selective filter for dynamical downscaling with the conformal cubic atmospheric model , author=. Monthly Weather Review , volume=

  54. [54]

    Earth's Future , volume=

    Evaluation of dynamically downscaled CMIP6-CCAM models over Australia , author=. Earth's Future , volume=. 2023 , publisher=

  55. [55]

    Journal of Geophysical Research: Atmospheres , volume=

    High-resolution CCAM simulations over New Zealand and the South Pacific for the detection and attribution of weather extremes , author=. Journal of Geophysical Research: Atmospheres , volume=. 2023 , publisher=

  56. [56]

    Dynamical downscaling CMIP6 models over New Zealand: added value of climatology and extremes. Climate Dynamics, 2024.

  57. [57]

    Downscaled climate projections of tropical and ex-tropical cyclones over the southwest Pacific. Journal of Geophysical Research: Atmospheres, 2025.

  58. [58]

    Comparison of three reanalysis-driven regional climate models over New Zealand: Climatology and extreme events. International Journal of Climatology, 2024.

  59. [59]

    Simulation of an intense tropical cyclone in the conformal cubic atmospheric model and its sensitivity to horizontal resolution. Weather and Climate Extremes, 2025.

  60. [60]

    Extreme event attribution using km-scale simulations reveals the pronounced role of climate change in the Durban floods. Communications Earth & Environment, 2025.

  61. [61]

    Machine learning emulation of precipitation from km-scale UK regional climate simulations using a diffusion model. Journal of Advances in Modeling Earth Systems, 2026.

  62. [62]

    Rampal, Neelesh and González-Abad, Jose and Gibson, Peter and Engelbrecht, Francois and Steinkopf, Jessica and Hardy, Caroline. 2026. doi:10.5281/zenodo.18591082

  63. [63]

    An intercomparison of a large ensemble of statistical downscaling methods over Europe: Results from the VALUE perfect predictor cross-validation experiment. International Journal of Climatology, 2019.

  64. [64]

    Prithvi WxC: Foundation model for weather and climate. arXiv preprint arXiv:2409.13598.

  65. [65]

    Added value in regional climate modeling. Wiley Interdisciplinary Reviews: Climate Change, 2016.

  66. [66]

    VALUE: A framework to validate downscaling approaches for climate change studies. Earth's Future, 2015.

  67. [67]

    Graph neural networks for hourly precipitation projections at the convection permitting scale with a novel hybrid imperfect framework. Environmental Data Science, 2025.

  68. [68]

    Residual corrective diffusion modeling for km-scale atmospheric downscaling. Communications Earth & Environment, 2025.

  69. [69]

    EnScale: Temporally-consistent multivariate generative downscaling via proper scoring rules. arXiv preprint arXiv:2509.26258.

  70. [70]

    Global spatio-temporal ERA5 precipitation downscaling to km and sub-hourly scale using generative AI. npj Climate and Atmospheric Science, 2025.

  71. [71]

    spateGAN: spatio-temporal downscaling of rainfall fields using a cGAN approach. Earth and Space Science, 2023.

  72. [72]

    Score-based generative modeling through stochastic differential equations. arXiv preprint arXiv:2011.13456.

  73. [73]

    An image is worth 16x16 words: Transformers for image recognition at scale. arXiv preprint arXiv:2010.11929.

  74. [74]

    High-Resolution Probabilistic Data-Driven Weather Modeling with a Stretched-Grid. arXiv preprint arXiv:2511.23043.

  75. [75]

    An intercomparison of generative machine learning methods for downscaling precipitation at fine spatial scales. arXiv preprint arXiv:2512.13987.

  76. [76]

    Probabilistic multisite precipitation downscaling by an expanded Bernoulli-Gamma density network. Journal of Hydrometeorology.

  77. [77]

    Indices for monitoring changes in extremes based on daily temperature and precipitation data. Wiley Interdisciplinary Reviews: Climate Change, 2011.

  78. [78]

    An artificial neural network technique for downscaling GCM outputs to RCM spatial scale. Nonlinear Processes in Geophysics, 2011.

  79. [79]

    Fixing the double penalty in data-driven weather forecasting through a modified spherical harmonic loss function. arXiv preprint arXiv:2501.19374.

  80. [80]

    A generative deep learning approach to stochastic downscaling of precipitation forecasts. Journal of Advances in Modeling Earth Systems, 2022.

Showing first 80 references.