HealDA: Highlighting the importance of initial errors in end-to-end AI weather forecasts
Pith reviewed 2026-05-16 11:50 UTC · model grok-4.3
The pith
A simple machine-learning data assimilation system supplies initial conditions to off-the-shelf AI weather models, and the resulting forecasts lose less than one day of effective lead time when scored against ERA5.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
HealDA functions strictly as a data assimilation module whose analyses initialize off-the-shelf ML forecast models without fine-tuning. For models including FCN3, Aurora, and FengWu, the HealDA-initialized forecasts lose less than one day of effective lead time when scored against ERA5, and HealDA-initialized FCN3 ensembles trail the ECMWF IFS ENS system by less than 24 hours. Forecast error growth is unchanged under HealDA initialization; the skill gap arises primarily from larger initial errors, which spectral analysis attributes to overfitting on large scales and upper-tropospheric fields.
What carries the argument
HealDA, the direct observation-to-state neural network that converts a short window of observations into a 1° HEALPix atmospheric analysis without iterative steps.
If this is right
- Error growth rates in the ML forecast models stay the same whether initialized by HealDA or by NWP analyses.
- The skill gap originates mainly from higher initial errors concentrated at large scales and in the upper troposphere.
- Verification setup variations can alter apparent skill differences by 12-24 hours, requiring consistent scoring.
- A direct-mapping ML DA system already supplies initial conditions usable by current state-of-the-art ML forecast models.
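The first two points above can be tied together with a short worked relation. This is a sketch under an idealization that the summary does not state: both forecast sets are assumed to follow pure exponential error growth with a shared rate $\lambda$ (saturation ignored), with initial errors $E_0^{\mathrm{ctl}}$ for the NWP-analysis control and $E_0^{\mathrm{H}}$ for HealDA. All symbols are introduced here for illustration.

```latex
\begin{align}
E^{\mathrm{ctl}}(t) &= E_0^{\mathrm{ctl}}\, e^{\lambda t}, &
E^{\mathrm{H}}(t) &= E_0^{\mathrm{H}}\, e^{\lambda t}, \\
E^{\mathrm{H}}(t) &= E^{\mathrm{ctl}}(t + \Delta t)
&\Longrightarrow\quad
\Delta t &= \frac{1}{\lambda}\,\ln\frac{E_0^{\mathrm{H}}}{E_0^{\mathrm{ctl}}}.
\end{align}
```

Under these assumptions the lead-time loss $\Delta t$ does not depend on $t$: equal growth rates turn a larger initial error into a constant offset in lead time, which is the pattern the bullets describe.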
Where Pith is reading between the lines
- Reducing overfitting on large scales inside HealDA would likely close most of the remaining skill gap.
- This direct-mapping approach could support faster, lower-cost end-to-end ML weather pipelines by cutting dependence on full NWP assimilation infrastructure.
- Future progress in AI weather forecasting may depend more on improving initial-condition quality than on further model architecture changes.
Load-bearing premise
The selected off-the-shelf ML forecast models represent the broader class of AI weather models, and the chosen verification metrics and observation window fairly capture operational differences.
What would settle it
A side-by-side plot of error-growth curves for HealDA-initialized versus ERA5-initialized runs in an additional ML model, or the same comparison repeated with verification metrics focused on small-scale fields.
Original abstract
AI weather models now rival leading numerical weather prediction (NWP) systems in medium-range skill. However, almost all still rely on NWP data assimilation (DA) to provide initial conditions, tying them to expensive infrastructure and limiting the practical speed and accuracy gains of ML. More recently, ML-based DA systems have been proposed, which are often trained and evaluated end-to-end with a forecast model, making it difficult to assess the quality of their analysis fields. We introduce HealDA, a global ML-based DA system that maps a short window of satellite and conventional observations directly to a 1° atmospheric state on the HEALPix grid, using a smaller sensor suite than operational NWP. We treat HealDA strictly as a DA module: its analyses are used to initialize off-the-shelf ML forecast models without any fine-tuning of either. For a variety of off-the-shelf ML forecast models, including FourCastNet3 (FCN3), Aurora, and FengWu, HealDA-initialized forecasts lose less than one day of effective lead time when scored against ERA5. HealDA-initialized FCN3 ensembles similarly trail those of the ECMWF IFS ENS system by <24 h. We find that forecast error growth in these models is unchanged from HealDA initialization, and the skill gap primarily arises from the larger initial error of the HealDA analysis. Spectral analysis reveals that this stems from overfitting to the large scales and upper-tropospheric fields. We also demonstrate that small changes in the verification setup can shift apparent skill by 12-24 h, underscoring the need for consistent scoring. Taken together, these results clarify the current performance of ML-based DA systems and show that a relatively simple, direct observation-to-state network can already provide initial conditions that are usable by state-of-the-art ML forecast models with only modest loss in medium-range skill.
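The abstract's "effective lead time" loss can be made concrete with a small sketch. The version below is one plausible reading, not the authors' stated procedure: for each lead time of the test forecast, it finds the lead time at which a control forecast reaches the same error, and reports the difference. The function name `effective_lead_time_loss` and the synthetic curves are hypothetical.

```python
import numpy as np

def effective_lead_time_loss(lead_hours, err_control, err_test):
    """Hours of lead time lost by the test forecast relative to the control.

    For each test lead time t, find the control lead time t_c at which the
    control error equals the test error at t; the loss is t_c - t.
    Assumes the control error increases strictly with lead time.
    """
    lead_hours = np.asarray(lead_hours, dtype=float)
    err_control = np.asarray(err_control, dtype=float)
    err_test = np.asarray(err_test, dtype=float)
    if np.any(np.diff(err_control) <= 0):
        raise ValueError("control error must increase strictly with lead time")
    # Invert the control curve (error -> lead time) by linear interpolation.
    equivalent_lead = np.interp(err_test, err_control, lead_hours)
    loss = equivalent_lead - lead_hours
    # Test errors outside the control's error range cannot be matched.
    outside = (err_test < err_control[0]) | (err_test > err_control[-1])
    loss[outside] = np.nan
    return loss

# Synthetic, made-up curves for illustration only (not results from the paper).
leads = np.arange(0, 241, 24)           # lead times in hours
control = 50.0 + 2.0 * leads            # hypothetical control RMSE
test = 50.0 + 2.0 * (leads + 18.0)      # same curve shifted by 18 h
loss = effective_lead_time_loss(leads, control, test)
```

With these synthetic inputs the loss is 18 h at every lead time where the control curve can be matched, and NaN at the last lead time, where the test error exceeds anything the control reaches within 240 h.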
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces HealDA, a global ML-based data assimilation system that maps a short window of satellite and conventional observations directly to a 1° atmospheric state on the HEALPix grid. Treating HealDA strictly as a DA module, the authors initialize off-the-shelf ML forecast models (FourCastNet3, Aurora, FengWu) without fine-tuning and report that the resulting forecasts lose less than one day of effective lead time when scored against ERA5, with unchanged error growth rates relative to ERA5-initialized runs. The skill gap is attributed primarily to larger initial errors arising from overfitting to large scales and upper-tropospheric fields, supported by spectral analysis. HealDA-initialized FCN3 ensembles trail ECMWF IFS ENS by <24 h. The paper also shows that small changes in verification setup can shift apparent skill by 12-24 h.
Significance. If the central quantitative claims prove robust, the work is significant for clarifying the role of initial-condition errors in AI weather models and demonstrating that a relatively simple, direct observation-to-state ML DA system can deliver usable initial conditions for state-of-the-art forecast models with only modest medium-range skill loss. The cross-model empirical tests, ensemble comparisons, and spectral diagnosis of error sources provide concrete evidence that initial-error magnitude, rather than altered error growth, drives the performance gap. This could reduce reliance on expensive NWP DA infrastructure.
major comments (3)
- [Results (lead-time and error-growth comparisons)] The central claim that HealDA-initialized forecasts lose <1 day of effective lead time (and exhibit unchanged error growth) is load-bearing for the paper's conclusions, yet the manuscript itself reports that small changes in verification setup shift apparent skill by 12-24 h. Without systematic sensitivity tests across the specific choices of scoring metric, reference threshold, pressure levels/variables, and ERA5 vs. independent observations for the HealDA vs. control comparisons, it is unclear whether the quantitative bound holds under alternative but plausible protocols.
- [Error growth analysis subsection] The statement that forecast error growth remains unchanged from HealDA initialization requires explicit quantitative support, such as fitted growth rates with confidence intervals or statistical tests comparing HealDA-initialized vs. ERA5-initialized trajectories, to confirm the difference is not significant given the larger initial errors.
- [Spectral analysis section] The spectral analysis attributing initial errors to overfitting on large scales and upper-tropospheric fields is used to explain the skill gap; the manuscript should specify the exact spectral bands, variables, and quantitative metric (e.g., power spectrum ratio or scale-dependent RMSE) used to identify this overfitting and demonstrate it is not an artifact of the chosen verification window.
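The second major comment asks for fitted growth rates with confidence intervals. A minimal sketch of one way to produce them follows. It assumes an exponential error model fitted by least squares on log error, and a percentile bootstrap over forecast cases. Both choices, the function names, and the synthetic data are illustrative assumptions rather than the manuscript's method.

```python
import numpy as np

def fit_growth_rate(lead_hours, errors):
    """Slope of log(error) against lead time, in 1/hour (least squares)."""
    slope, _intercept = np.polyfit(lead_hours, np.log(errors), 1)
    return slope

def bootstrap_growth_rate(lead_hours, error_cases, n_boot=1000, seed=0):
    """Point estimate and 95% percentile interval for the growth rate.

    error_cases has shape (n_cases, n_leads): one error curve per forecast
    case. Each replicate resamples cases with replacement, averages the
    curves, and refits the rate.
    """
    rng = np.random.default_rng(seed)
    error_cases = np.asarray(error_cases, dtype=float)
    n_cases = error_cases.shape[0]
    rates = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n_cases, size=n_cases)
        rates[b] = fit_growth_rate(lead_hours, error_cases[idx].mean(axis=0))
    point = fit_growth_rate(lead_hours, error_cases.mean(axis=0))
    low, high = np.percentile(rates, [2.5, 97.5])
    return point, (low, high)

# Synthetic curves for illustration only (not data from the paper):
# identical growth rate, 30% larger initial error for the second set.
rng = np.random.default_rng(42)
leads = np.arange(0, 121, 12)
true_rate = 0.01  # 1/hour, hypothetical
noise = rng.lognormal(mean=0.0, sigma=0.05, size=(2, 40, leads.size))
cases_ctl = 1.0 * np.exp(true_rate * leads) * noise[0]
cases_hda = 1.3 * np.exp(true_rate * leads) * noise[1]

rate_ctl, ci_ctl = bootstrap_growth_rate(leads, cases_ctl)
rate_hda, ci_hda = bootstrap_growth_rate(leads, cases_hda)
```

Comparing `ci_ctl` with `ci_hda` gives the kind of evidence the comment requests. Overlapping intervals are suggestive, but a formal test on the rate difference would be the stronger statement.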
minor comments (3)
- Figure captions should explicitly label all curves (model, initialization method, ensemble vs. deterministic) and include the verification metric and reference dataset for immediate readability.
- The abstract states results for 'a variety of off-the-shelf ML forecast models' but the main text should list all tested models and any selection criteria if additional models beyond FCN3, Aurora, and FengWu were evaluated.
- Consider adding a summary table of effective lead-time losses broken down by model, variable, and pressure level to complement the narrative claims.
Simulated Author's Rebuttal
We thank the referee for their constructive and detailed review. The comments highlight important aspects of robustness in our quantitative claims. We have revised the manuscript to incorporate additional analyses addressing each major point, as detailed below.
Point-by-point responses
-
Referee: [Results (lead-time and error-growth comparisons)] The central claim that HealDA-initialized forecasts lose <1 day of effective lead time (and exhibit unchanged error growth) is load-bearing for the paper's conclusions, yet the manuscript itself reports that small changes in verification setup shift apparent skill by 12-24 h. Without systematic sensitivity tests across the specific choices of scoring metric, reference threshold, pressure levels/variables, and ERA5 vs. independent observations for the HealDA vs. control comparisons, it is unclear whether the quantitative bound holds under alternative but plausible protocols.
Authors: We agree that systematic sensitivity testing strengthens the central claim. In the revised manuscript we have added a dedicated sensitivity analysis subsection. This includes tests varying the scoring metric (RMSE versus anomaly correlation coefficient), reference thresholds for effective lead time, and pressure levels/variables (Z500, T850, U200). The <1-day effective lead-time loss remains consistent across these choices, with variations of 12-24 h as previously noted. All comparisons use ERA5 as the common reference for both HealDA and control runs to maintain fairness. We also discuss the practical limitations of independent global observations and why ERA5 provides the most consistent benchmark. revision: yes
-
Referee: [Error growth analysis subsection] The statement that forecast error growth remains unchanged from HealDA initialization requires explicit quantitative support, such as fitted growth rates with confidence intervals or statistical tests comparing HealDA-initialized vs. ERA5-initialized trajectories, to confirm the difference is not significant given the larger initial errors.
Authors: We appreciate this request for quantitative rigor. We have added fitted exponential growth rates (with 95% confidence intervals obtained via bootstrap resampling) to the error-growth subsection. For each model and variable, the growth rates from HealDA and ERA5 initializations are statistically indistinguishable (two-sample t-test on bootstrap replicates, p > 0.05). The confidence intervals overlap substantially, confirming that the larger initial error, rather than altered growth, accounts for the skill gap. These results and the associated statistical tests are now reported explicitly. revision: yes
-
Referee: [Spectral analysis section] The spectral analysis attributing initial errors to overfitting on large scales and upper-tropospheric fields is used to explain the skill gap; the manuscript should specify the exact spectral bands, variables, and quantitative metric (e.g., power spectrum ratio or scale-dependent RMSE) used to identify this overfitting and demonstrate it is not an artifact of the chosen verification window.
Authors: We have expanded the spectral analysis section with the requested details. We compute the power-spectrum ratio (HealDA/ERA5) integrated over zonal wavenumber bands 1-10 (large scales) and 11-50 (mesoscales) for variables Z500, T850, and U200. Overfitting is identified by excess power ratios >1.2 in the large-scale band and upper-tropospheric levels. To rule out verification-window artifacts, we repeated the analysis over five independent 10-day windows spanning different seasons; the scale-dependent excess remains consistent. These specifications and robustness checks are now stated explicitly in the text and figure captions. revision: yes
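The band-integrated power-spectrum ratio named in the last response can be sketched as follows. This is a simplified illustration: it takes a real FFT along longitude on a regular latitude-longitude grid and averages over latitude without area weighting, whereas the paper works on a HEALPix grid and its exact spectral method is not given here. The function names and test fields are hypothetical.

```python
import numpy as np

def zonal_power_spectrum(field):
    """Mean zonal power spectrum of a (n_lat, n_lon) field.

    Takes a real FFT along longitude and averages |coefficient|^2 over
    latitude. Index k of the result corresponds to zonal wavenumber k.
    """
    coeffs = np.fft.rfft(field, axis=-1) / field.shape[-1]
    return (np.abs(coeffs) ** 2).mean(axis=0)

def band_power_ratio(field_a, field_b, k_min, k_max):
    """Ratio of summed power over wavenumbers k_min..k_max (inclusive)."""
    power_a = zonal_power_spectrum(field_a)[k_min:k_max + 1].sum()
    power_b = zonal_power_spectrum(field_b)[k_min:k_max + 1].sum()
    return power_a / power_b

# Synthetic fields for illustration only: the "analysis" has 20% more
# amplitude at wavenumber 3 and identical amplitude at wavenumber 20.
n_lat, n_lon = 8, 360
lon = 2.0 * np.pi * np.arange(n_lon) / n_lon
rows = np.ones((n_lat, 1))
reference = rows * (np.cos(3 * lon) + 0.5 * np.cos(20 * lon))
analysis = rows * (1.2 * np.cos(3 * lon) + 0.5 * np.cos(20 * lon))

large_scale_ratio = band_power_ratio(analysis, reference, 1, 10)
mesoscale_ratio = band_power_ratio(analysis, reference, 11, 50)
```

Because power scales with amplitude squared, the 20% amplitude excess at wavenumber 3 yields a large-scale ratio of 1.44, above the 1.2 threshold quoted in the response, while the 11-50 band ratio stays at 1.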
Circularity Check
No circularity: purely empirical evaluation
full rationale
The manuscript presents HealDA as a trained ML mapping from observations to analysis state, then reports direct empirical comparisons of initialized forecasts against ERA5 and ECMWF ensembles using standard skill metrics. No equations, uniqueness theorems, or derivations are invoked; all headline claims (effective lead-time loss <1 day, unchanged error growth) are measured outcomes on held-out data rather than quantities forced by construction from fitted parameters or self-citations. Verification sensitivity is acknowledged but does not alter the non-circular status of the reported measurements.
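For readers unfamiliar with the standard skill metrics referred to above, the sketch below shows latitude-weighted RMSE and an anomaly correlation coefficient (ACC), the two metrics named in the rebuttal. It assumes a regular latitude-longitude grid with cosine-latitude weights and an uncentered ACC; the paper's exact definitions may differ, and the function names are hypothetical.

```python
import numpy as np

def latitude_weights(lat_deg):
    """Cosine-latitude weights, normalized to have mean 1."""
    w = np.cos(np.deg2rad(lat_deg))
    return w / w.mean()

def weighted_rmse(forecast, truth, lat_deg):
    """Area-weighted root-mean-square error of a (n_lat, n_lon) field."""
    w = latitude_weights(lat_deg)[:, None]
    return float(np.sqrt(np.mean(w * (forecast - truth) ** 2)))

def weighted_acc(forecast, truth, climatology, lat_deg):
    """Area-weighted, uncentered anomaly correlation coefficient."""
    w = latitude_weights(lat_deg)[:, None]
    f_anom = forecast - climatology
    t_anom = truth - climatology
    numerator = np.sum(w * f_anom * t_anom)
    denominator = np.sqrt(np.sum(w * f_anom ** 2) * np.sum(w * t_anom ** 2))
    return float(numerator / denominator)

# Synthetic fields for illustration only (not data from the paper).
rng = np.random.default_rng(0)
lat = np.linspace(-87.5, 87.5, 36)
truth = rng.normal(size=(lat.size, 72))
climatology = np.zeros_like(truth)

rmse_perfect = weighted_rmse(truth, truth, lat)
rmse_biased = weighted_rmse(truth + 1.0, truth, lat)
acc_perfect = weighted_acc(truth, truth, climatology, lat)
acc_inverted = weighted_acc(-truth, truth, climatology, lat)
```

A uniform bias of 1 gives a weighted RMSE of 1 because the weights average to 1. Effective lead time is then typically read off where such a metric crosses a reference threshold.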
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption ERA5 reanalysis serves as a reliable verification target for medium-range skill
Lean theorems connected to this paper
-
IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction
Relation between the paper passage and the cited Recognition theorem: unclear.
HealDA consists of two main components: an observation encoder followed by an HPX vision transformer (ViT) backbone... trained jointly end-to-end under a single supervised regression objective.
-
IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel
Relation between the paper passage and the cited Recognition theorem: unclear.
We find that forecast error growth in these models is unchanged from HealDA initialization, and the skill gap primarily arises from the larger initial error of the HealDA analysis.
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Forward citations
Cited by 1 Pith paper
-
Towards accurate extreme event likelihoods from diffusion model climate emulators
Diffusion model climate emulators provide probability density estimates that allow likelihood calculations and odds-ratio-based importance sampling for extreme events such as tropical cyclones.