pith. machine review for the scientific record.

arxiv: 2603.14845 · v3 · submitted 2026-03-16 · 💻 cs.LG · cs.AI

Recognition: 1 theorem link · Lean Theorem

Integrating Weather Foundation Model and Satellite to Enable Fine-Grained Solar Irradiance Forecasting

Pith reviewed 2026-05-15 10:36 UTC · model grok-4.3

keywords solar irradiance forecasting · weather foundation model · satellite imagery fusion · cloud cover prediction · fine-grained forecasting · multimodal integration · day-ahead solar prediction

The pith

A two-stage fusion of weather foundation model forecasts and satellite imagery produces kilometer-scale solar irradiance predictions for the next 24 hours.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces Baguan-solar, a multimodal framework that combines outputs from the global weather foundation model Baguan with high-resolution geostationary satellite data. Its decoupled design first generates day-night continuous fields such as cloud cover from the foundation model, then infers irradiance while preserving fine-scale cloud structures through joint modality fusion. This addresses the coarse resolution of numerical weather predictions and the limited lead-time accuracy of satellite extrapolation alone. Over East Asia, using CLDAS as ground truth, the approach reduces forecast RMSE by 16.08 percent relative to strong baselines including ECMWF IFS and prior satellite methods.

Core claim

Baguan-solar employs a decoupled two-stage multimodal architecture: Baguan first produces continuous intermediate forecasts including cloud cover, after which irradiance is inferred by fusing those fields with high-resolution satellite imagery to retain kilometer-scale cloud details while respecting large-scale weather constraints.

What carries the argument

Decoupled two-stage multimodal fusion that first predicts continuous intermediates from the weather foundation model and then jointly incorporates satellite imagery for irradiance inference.
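
As a rough illustration of the data flow only, the sketch below mimics the decoupled design with hand-written stand-ins. It is not the paper's Swin Transformer model: the function names, the lead-time blending weight, and the use of the Kasten-Czeplak cloud attenuation formula in place of a learned second stage are all assumptions made for this example.

```python
import numpy as np

def upsample_nearest(field, factor):
    """Repeat each coarse cell `factor` times along both axes."""
    return np.repeat(np.repeat(field, factor, axis=0), factor, axis=1)

def stage1_cloud_cover(coarse_cloud, sat_cloud, lead_hours, decay_hours=6.0):
    """Hypothetical stage 1: blend fine satellite detail with a coarse forecast.

    The exponential weight is an assumption for illustration: satellite detail
    dominates at short lead times, the large-scale forecast at long ones.
    """
    factor = sat_cloud.shape[0] // coarse_cloud.shape[0]
    large_scale = upsample_nearest(coarse_cloud, factor)
    w = np.exp(-lead_hours / decay_hours)
    return np.clip(w * sat_cloud + (1.0 - w) * large_scale, 0.0, 1.0)

def stage2_irradiance(cloud_cover, clear_sky_ghi):
    """Hypothetical stage 2: Kasten-Czeplak attenuation of clear-sky GHI.

    cloud_cover is a fraction in [0, 1]; a learned model would replace this.
    """
    return clear_sky_ghi * (1.0 - 0.75 * cloud_cover ** 3.4)

# Synthetic inputs: a 2x2 coarse cloud forecast and an 8x8 satellite cloud field.
coarse = np.array([[0.2, 0.8], [0.5, 0.0]])
rng = np.random.default_rng(0)
sat = rng.random((8, 8))

cloud = stage1_cloud_cover(coarse, sat, lead_hours=3.0)  # continuous intermediate
ghi = stage2_irradiance(cloud, clear_sky_ghi=800.0)      # irradiance in W/m^2
print(cloud.shape, ghi.shape)
```

The point of the split is visible even in this toy: the intermediate cloud field is defined at night as well as by day, while irradiance is only derived from it in the second step.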

If this is right

  • Delivers 24-hour solar irradiance forecasts at kilometer resolution suitable for day-ahead power grid scheduling.
  • Improves resolution of transient cloud effects on irradiance compared to global numerical weather models.
  • Supports operational solar power forecasting as demonstrated by deployment in an eastern Chinese province.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same two-stage fusion pattern could be tested on other cloud-sensitive variables such as precipitation or surface temperature.
  • Regional fine-tuning of the satellite component might further reduce errors in areas with distinct cloud regimes.
  • Extending the intermediates to include additional variables like aerosol optical depth could broaden applicability to air-quality-linked solar attenuation.

Load-bearing premise

The fusion step accurately preserves fine-scale cloud structures from satellite data without adding systematic bias when deriving irradiance values from the predicted cloud fields.

What would settle it

Direct pixel-level comparison of the model's inferred cloud cover fields against independent high-resolution satellite observations or ground-based measurements over a multi-day period would reveal any consistent spatial biases in the resulting irradiance forecasts.
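
One way to run such a comparison is sketched here, assuming the forecast and reference fields are already co-located on the same grid as arrays of shape (time, height, width). The function name and the synthetic data are hypothetical.

```python
import numpy as np

def bias_map(forecast, observed):
    """Mean signed error per pixel, averaged over the time axis (axis 0)."""
    return np.mean(forecast - observed, axis=0)

# Synthetic check: 48 time steps on a 4x4 grid, with a bias injected at one pixel.
rng = np.random.default_rng(0)
observed = rng.random((48, 4, 4))
forecast = observed.copy()
forecast[:, 0, 0] += 0.1  # systematic over-forecast at pixel (0, 0)

bias = bias_map(forecast, observed)
print(np.round(bias, 3))
```

A map that is close to zero everywhere except in coherent patches would point to a spatially consistent bias rather than random error.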

Figures

Figures reproduced from arXiv: 2603.14845 by Cong Bai, Haifan Zhang, Kai Ying, Liang Sun, Peisong Niu, Tianyu Zhu, Tian Zhou, Xinyue Gu, Zheng Wang, Ziqing Ma.

Figure 1: From coarse Baguan forecasts to fine-grained GHI: the Baguan-solar overall framework.
Figure 2: Baguan-solar model architecture. Baguan-solar uses a two-stage Swin Transformer framework that fuses Himawari ...
Figure 3: RMSE of GHI forecasts as a function of lead time
Figure 4: Qualitative comparison of GHI forecast fields for two representative cases initialized at UTC 00:00 and 12:00.
Figure 5: Importance of ERA5 vs Satellites across lead times.
Original abstract

Accurate day-ahead solar irradiance forecasting is essential for integrating solar energy into the power grid. However, it remains challenging due to the pronounced diurnal cycle and inherently complex cloud dynamics. Current methods either lack fine-scale resolution (e.g., numerical weather prediction, weather foundation models) or degrade at longer lead times (e.g., satellite extrapolation). We propose Baguan-solar, a two-stage multimodal framework that fuses forecasts from Baguan, a global weather foundation model, with high-resolution geostationary satellite imagery to produce 24-hour irradiance forecasts at kilometer scale. Its decoupled two-stage design first forecasts day-night continuous intermediates (e.g., cloud cover) and then infers irradiance, while its modality fusion jointly preserves fine-scale cloud structures from satellite and large-scale constraints from Baguan forecasts. Evaluated over East Asia using CLDAS as ground truth, Baguan-solar outperforms strong baselines (including ECMWF IFS, vanilla Baguan, and SolarSeer), reducing RMSE by 16.08% and better resolving cloud-induced transients. An operational deployment of Baguan-solar has supported solar power forecasting in an eastern province in China, since July 2025. Our code is accessible at https://github.com/DAMO-DI-ML/Baguan-solar.git.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 2 minor

Summary. The paper proposes Baguan-solar, a two-stage multimodal framework for 24-hour solar irradiance forecasting at kilometer scale. It first generates day-night continuous intermediates (e.g., cloud cover) from the Baguan global weather foundation model, then fuses these forecasts with high-resolution geostationary satellite imagery to infer irradiance. Evaluated over East Asia against CLDAS ground truth, it reports a 16.08% RMSE reduction relative to baselines including ECMWF IFS, vanilla Baguan, and SolarSeer, with improved resolution of cloud-induced transients, and notes an operational deployment supporting solar power forecasting in an eastern Chinese province since July 2025. Code is released at a public GitHub repository.

Significance. If the central fusion mechanism is shown to preserve fine-scale satellite cloud structures without bias, the work would offer a practical advance in solar forecasting by combining large-scale constraints from weather foundation models with local satellite detail. The reported performance gain, explicit baseline comparisons, and real-world operational use would strengthen its relevance for grid integration of solar energy, while the open code supports reproducibility.

major comments (1)
  1. [§3] §3 (Method, two-stage fusion description): The decoupled design claims that modality fusion 'jointly preserves fine-scale cloud structures from satellite and large-scale constraints from Baguan forecasts,' yet no quantitative validation is provided (e.g., cloud-mask IoU, power-spectrum comparison of cloud fields, or per-pixel bias maps against raw satellite imagery). This check is load-bearing for attributing the 16.08% RMSE reduction and improved transient resolution to true fine-scale preservation rather than large-scale Baguan skill alone.
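
Two of the checks named in this comment can be sketched with NumPy alone. The block below is an illustrative sketch, not the authors' evaluation code: the function names, the 0.5 cloud threshold, and the assumption of square fields are choices made for the example.

```python
import numpy as np

def cloud_mask_iou(pred_cover, ref_cover, threshold=0.5):
    """Intersection over union of binary cloud masks (cover >= threshold)."""
    pred = pred_cover >= threshold
    ref = ref_cover >= threshold
    union = np.logical_or(pred, ref).sum()
    if union == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return np.logical_and(pred, ref).sum() / union

def radial_power_spectrum(field):
    """Radially averaged 2-D power spectrum of a square field.

    Returns mean power for integer wavenumbers 0 .. n/2 - 1.
    """
    n = field.shape[0]
    power = np.abs(np.fft.fftshift(np.fft.fft2(field - field.mean()))) ** 2
    ky, kx = np.indices(power.shape)
    radius = np.hypot(ky - n // 2, kx - n // 2).astype(int)
    sums = np.bincount(radius.ravel(), weights=power.ravel())
    counts = np.bincount(radius.ravel())
    return (sums / counts)[: n // 2]

# IoU on a tiny example: one shared cloudy pixel out of two flagged in total.
pred = np.array([[1.0, 1.0], [0.0, 0.0]])
ref = np.array([[1.0, 0.0], [0.0, 0.0]])
print(cloud_mask_iou(pred, ref))

# Spectrum of a field with a single spatial frequency (4 cycles across 16 pixels).
x = np.arange(16)
stripes = np.tile(np.cos(2 * np.pi * 4 * x / 16), (16, 1))
spectrum = radial_power_spectrum(stripes)
print(int(np.argmax(spectrum)))
```

Comparing the spectrum of a forecast cloud field with that of the satellite field at high wavenumbers would show whether fine-scale structure survives the fusion or is smoothed away.
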
minor comments (2)
  1. [Abstract] Abstract: The operational deployment statement ('since July 2025') should include the exact start date, duration, and any performance metrics from the live system to allow assessment of real-world impact.
  2. [Experiments] Experiments section: The RMSE comparisons would be strengthened by reporting error bars, number of test periods, or statistical significance tests for the 16.08% reduction.
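
The kind of uncertainty estimate requested here could look like the following sketch. It assumes a relative reduction defined as 1 minus the ratio of model RMSE to baseline RMSE and a percentile bootstrap over forecast cases; the page does not state how the paper computes its 16.08% figure, so both choices, along with the function names and synthetic data, are assumptions.

```python
import numpy as np

def rmse(pred, obs):
    """Root-mean-square error over all elements."""
    return float(np.sqrt(np.mean((pred - obs) ** 2)))

def relative_rmse_reduction(model, baseline, obs):
    """Percent reduction of model RMSE relative to baseline RMSE."""
    return 100.0 * (1.0 - rmse(model, obs) / rmse(baseline, obs))

def bootstrap_ci(model, baseline, obs, n_boot=500, alpha=0.05, seed=0):
    """Percentile confidence interval, resampling cases (axis 0) with replacement."""
    rng = np.random.default_rng(seed)
    n = obs.shape[0]
    stats = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)
        stats[b] = relative_rmse_reduction(model[idx], baseline[idx], obs[idx])
    return tuple(np.percentile(stats, [100 * alpha / 2, 100 * (1 - alpha / 2)]))

# Synthetic data: 200 forecast cases on an 8x8 grid, baseline noisier than model.
rng = np.random.default_rng(0)
obs = rng.random((200, 8, 8)) * 800.0
baseline = obs + rng.normal(0.0, 100.0, obs.shape)
model = obs + rng.normal(0.0, 80.0, obs.shape)

point = relative_rmse_reduction(model, baseline, obs)
low, high = bootstrap_ci(model, baseline, obs)
print(f"reduction: {point:.2f}% (95% CI {low:.2f} to {high:.2f})")
```

Resampling whole forecast cases rather than individual pixels keeps spatially correlated errors together, which avoids overstating confidence.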

Simulated Authors' Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive feedback on our manuscript. We address the major comment below and have revised the manuscript to incorporate additional quantitative validation as suggested.

Point-by-point responses
  1. Referee: [§3] §3 (Method, two-stage fusion description): The decoupled design claims that modality fusion 'jointly preserves fine-scale cloud structures from satellite and large-scale constraints from Baguan forecasts,' yet no quantitative validation is provided (e.g., cloud-mask IoU, power-spectrum comparison of cloud fields, or per-pixel bias maps against raw satellite imagery). This check is load-bearing for attributing the 16.08% RMSE reduction and improved transient resolution to true fine-scale preservation rather than large-scale Baguan skill alone.

    Authors: We agree that direct quantitative validation of fine-scale preservation is important to rigorously attribute the performance gains to the multimodal fusion rather than Baguan skill alone. In the revised manuscript, we have added a new analysis subsection in §3 that includes: (i) cloud-mask IoU scores between the fused intermediates and raw geostationary satellite imagery, (ii) power-spectral-density comparisons of cloud fields to quantify retention of high-frequency spatial structures, and (iii) per-pixel bias maps against satellite observations. These metrics confirm that the decoupled fusion preserves satellite-derived fine-scale cloud details while respecting large-scale constraints from Baguan, thereby supporting the reported 16.08% RMSE reduction and improved transient resolution.

    Revision: yes

Circularity Check

0 steps flagged

No significant circularity; performance claims rest on external validation against CLDAS

Full rationale

The paper presents an empirical ML framework whose central claims (16.08% RMSE reduction and improved transient resolution) are measured directly against an external ground-truth dataset (CLDAS) and compared to named public baselines (ECMWF IFS, vanilla Baguan, SolarSeer). The two-stage decoupled fusion architecture is described as a design choice that takes Baguan intermediates as input and fuses them with satellite imagery; no equation or derivation step reduces the reported irradiance output to a fitted parameter or self-citation by construction. No self-definitional loops, fitted-input-as-prediction patterns, or load-bearing uniqueness theorems appear in the provided text. The evaluation remains falsifiable outside the model's own fitted values.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claim rests on the assumption that Baguan's intermediate cloud-cover forecasts are sufficiently accurate to serve as large-scale constraints and that CLDAS provides unbiased kilometer-scale ground truth; no free parameters or invented physical entities are described in the abstract.

axioms (2)
  • domain assumption Baguan foundation model produces usable day-night continuous cloud-cover intermediates
    Invoked in the description of the first stage of the two-stage framework.
  • domain assumption CLDAS reanalysis constitutes accurate ground truth for irradiance at kilometer scale
    Used as the evaluation benchmark over East Asia.

pith-pipeline@v0.9.0 · 5555 in / 1391 out tokens · 31542 ms · 2026-05-15T10:36:48.956345+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?

  • matches: The paper's claim is directly supported by a theorem in the formal canon.
  • supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
  • extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
  • uses: The paper appears to rely on the theorem as machinery.
  • contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
  • unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
