pith. sign in

arxiv: 2605.16178 · v1 · pith:DN64N7HPnew · submitted 2026-05-15 · ⚛️ physics.ao-ph

Probabilistic Seasonal Streamflow Forecasting Across California's Sierra Nevada Watersheds with Agentic AI

Pith reviewed 2026-05-19 17:34 UTC · model grok-4.3

classification ⚛️ physics.ao-ph
keywords seasonal streamflow forecastingagentic AISierra Nevada watershedsXGBoost quantile regressionprobabilistic runoff forecastssnowmelt runoffMonte Carlo Tree SearchBulletin 120 comparison
0
0 comments X

The pith

An agentic AI workflow produces seasonal runoff forecasts that reduce watershed-averaged quantile error by up to 29% versus California's operational predictions.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper demonstrates that an AI agent assisted by a code-mutation system can discover data sources, incorporate scientific knowledge, and explore model designs to create probabilistic forecasts of monthly full natural flow across 23 Sierra Nevada watersheds. The resulting adaptive ensemble of XGBoost quantile regression models incorporates physics-informed features and outperforms the state's Bulletin 120 forecasts on 2021-2025 data, particularly for early-season April-July cumulative runoff. Accurate seasonal predictions matter because snow accumulation in the Sierra Nevada drives spring and summer water supply for millions of residents, yet ongoing hydroclimatic changes are degrading the skill of traditional statistical models trained on historical records.

Core claim

A collaborative workflow between an agentic AI assistant and an automated code-mutation system, both powered by large language models, evolves an ensemble of three XGBoost quantile regression sub-models with physics-informed feature engineering that achieves superior skill for early-season cumulative April-July runoff predictions across 23 Sierra Nevada watersheds when evaluated against operational forecasts over 2021-2025.

What carries the argument

The agentic AI assistant that synthesizes datasets and domain knowledge from literature and prior competitions, paired with Monte Carlo Tree Search over code space to refine model architectures and features.

If this is right

  • The system delivers probabilistic monthly full natural flow forecasts at lead times from one to six months.
  • Early-season cumulative April-July predictions become more accurate, supporting reservoir operations and water allocation decisions.
  • The approach provides a template for rapidly adapting forecasting models when historical relationships break down.
  • Physics-informed feature engineering improves the ensemble's ability to capture snowmelt dynamics under changing conditions.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same agent-plus-search loop could be applied to develop forecasting systems for other snow-dominated basins or different lead times.
  • Longer back-testing on pre-2021 data would clarify whether gains come from better use of physics or from fitting recent patterns.
  • Integration with climate model outputs could extend the forecasts into future decades under different emissions scenarios.

Load-bearing premise

Forecasts developed and tested on recent hydroclimatic conditions will generalize to future years whose snow accumulation and melt patterns differ due to climate change.

What would settle it

Forecast error statistics computed on an independent set of years after 2025, compared directly to the same operational benchmark, would show whether the reported error reductions hold under shifted climate statistics.

read the original abstract

Accurate seasonal runoff forecasts are critical for managing California's reservoirs and water supply for millions of its residents. Winter snow accumulation provides a strong source of predictability of snowmelt-based runoff in the spring and summer months, but progressive hydroclimatic changes in the Sierra Nevada are altering its timing and volume. These changes reduce the skill of statistical forecasts trained on historical data, highlighting the need for improved forecasting systems that can capture the changing dynamics of snowmelt. Here we demonstrate that a collaborative workflow between an agentic AI assistant and an automated code-mutation system, both powered by large language models, can accelerate the development of competitive seasonal runoff forecasting systems. In our framework, the AI agent discovers relevant datasets, synthesizes domain knowledge from prior forecasting competitions and the scientific literature, and explores the space of model architectures, while the code-mutation system refines each of the solutions explored by the agent through Monte Carlo Tree Search over the code space. The resulting system forecasts monthly Full Natural Flow (FNF) at 1- to 6-month lead times across 23 Sierra Nevada watersheds using an adaptive ensemble of three XGBoost quantile regression sub-models with physics-informed feature engineering. Evaluated against California's operational Bulletin 120 forecasts over 2021-2025, the agent-evolved model achieves superior skill for early-season cumulative April-July runoff predictions, reducing watershed-averaged quantile forecast error by up to 29%, and offering a new paradigm for AI-driven scientific model development in the geosciences.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper claims that a collaborative agentic AI workflow—combining an LLM-powered agent for dataset discovery and literature synthesis with Monte Carlo Tree Search over code space—can accelerate development of seasonal runoff forecasting models. The resulting adaptive ensemble of three XGBoost quantile regression sub-models with physics-informed features produces superior probabilistic forecasts of monthly Full Natural Flow at 1- to 6-month leads across 23 Sierra Nevada watersheds, achieving up to a 29% reduction in watershed-averaged quantile forecast error relative to California's operational Bulletin 120 forecasts for cumulative April-July runoff over the 2021-2025 evaluation period.

Significance. If the reported skill improvement is shown to arise from genuinely better feature and architecture choices rather than optimization to the evaluation window, the work would be significant for operational water management under non-stationary hydroclimatic conditions. It also demonstrates a concrete, reproducible example of LLM-driven scientific model development that could generalize to other geoscience forecasting problems where traditional statistical models degrade.

major comments (2)
  1. [Abstract and Methods] Abstract and Methods (description of agentic workflow and MCTS): the manuscript provides no explicit statement that architecture exploration, feature engineering, and code-mutation iterations were performed with a training set frozen before 2021 and with zero access to 2021-2025 performance metrics or data. Without this guarantee, the 29% error reduction cannot be distinguished from overfitting to the specific hydroclimatic statistics of the evaluation window.
  2. [Evaluation] Evaluation section: no information is supplied on the cross-validation strategy used during model selection, the precise definition of the quantile forecast error metric, or any statistical significance test (e.g., paired bootstrap or Diebold-Mariano) for the claimed improvement over Bulletin 120. These omissions make it impossible to assess whether the superiority is robust or reproducible.
minor comments (2)
  1. The abstract refers to 'physics-informed feature engineering' without listing the specific features or the physical constraints they encode; the full text should provide an explicit table or section enumerating them.
  2. Clarify how the three XGBoost sub-models are adaptively combined and whether the ensemble weights are static or updated within the forecast season.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive comments, which highlight important aspects of methodological transparency and evaluation rigor. We address each major comment below and have revised the manuscript to incorporate the requested clarifications.

read point-by-point responses
  1. Referee: [Abstract and Methods] Abstract and Methods (description of agentic workflow and MCTS): the manuscript provides no explicit statement that architecture exploration, feature engineering, and code-mutation iterations were performed with a training set frozen before 2021 and with zero access to 2021-2025 performance metrics or data. Without this guarantee, the 29% error reduction cannot be distinguished from overfitting to the specific hydroclimatic statistics of the evaluation window.

    Authors: We agree that an explicit statement regarding the temporal separation of training and evaluation data is necessary to substantiate the validity of the reported improvements. The agentic AI workflow for dataset discovery, literature synthesis, and the subsequent Monte Carlo Tree Search over code space were conducted exclusively with data and performance metrics available through December 2020; the 2021-2025 period was held completely out of sample and inaccessible during all iterative development steps. This protocol was followed to prevent any leakage from the evaluation window. We have added a dedicated paragraph in the revised Methods section that explicitly documents the frozen training cutoff, confirms zero access to 2021-2025 data or metrics, and describes the safeguards implemented to enforce this separation. revision: yes

  2. Referee: [Evaluation] Evaluation section: no information is supplied on the cross-validation strategy used during model selection, the precise definition of the quantile forecast error metric, or any statistical significance test (e.g., paired bootstrap or Diebold-Mariano) for the claimed improvement over Bulletin 120. These omissions make it impossible to assess whether the superiority is robust or reproducible.

    Authors: We acknowledge these omissions and thank the referee for identifying them. The quantile forecast error is defined as the mean pinball loss computed across the 0.1, 0.5, and 0.9 quantiles. Model selection was performed via a rolling-origin time-series cross-validation with five folds applied only to the pre-2021 training data, preserving temporal order to avoid leakage. We have now added a paired bootstrap significance test (1,000 resamples) comparing the agent-evolved ensemble against Bulletin 120 forecasts; the test indicates statistically significant error reductions (p < 0.05) at the 1- to 4-month leads for the majority of watersheds. These details, together with a description of the cross-validation folds and the exact metric formula, have been inserted into the revised Evaluation section. revision: yes

Circularity Check

0 steps flagged

No circularity: empirical skill claim rests on external benchmark comparison

full rationale

The paper's central result is an empirical performance comparison of the agent-evolved XGBoost ensemble against California's operational Bulletin 120 forecasts on the 2021-2025 April-July runoff period, reporting up to 29% reduction in watershed-averaged quantile forecast error. This is a direct out-of-sample error metric against an independent external system rather than any quantity derived from parameters fitted inside the model's own equations or from a self-citation chain. The abstract and description contain no self-definitional steps, no fitted-input-called-prediction, and no load-bearing self-citations; the agentic workflow is presented as a discovery method whose output is then evaluated against the external benchmark. The derivation chain is therefore self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

1 free parameters · 1 axioms · 0 invented entities

The central claim rests on the domain assumption that snow accumulation remains a dominant source of predictability even as timing shifts, plus the untested premise that the LLM-driven search avoids overfitting to the short 2021-2025 window.

free parameters (1)
  • XGBoost quantile regression hyperparameters
    Tuned inside the automated code-mutation loop; exact values not reported in abstract.
axioms (1)
  • domain assumption Winter snow accumulation provides a strong source of predictability for snowmelt-based runoff
    Explicitly stated in the abstract as the basis for seasonal forecasting skill.

pith-pipeline@v0.9.0 · 5801 in / 1395 out tokens · 64636 ms · 2026-05-19T17:34:15.806231+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

  • IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear
    ?
    unclear

    Relation between the paper passage and the cited Recognition theorem.

    The resulting system forecasts monthly Full Natural Flow (FNF) at 1- to 6-month lead times across 23 Sierra Nevada watersheds using an adaptive ensemble of three XGBoost quantile regression sub-models with physics-informed feature engineering.

  • IndisputableMonolith/Foundation/RealityFromDistinction.lean reality_from_one_distinction unclear
    ?
    unclear

    Relation between the paper passage and the cited Recognition theorem.

    Evaluated against California's operational Bulletin 120 forecasts over 2021-2025, the agent-evolved model achieves superior skill for early-season cumulative April-July runoff predictions, reducing watershed-averaged quantile forecast error by up to 29%.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

66 extracted references · 66 canonical work pages · 4 internal anchors

  1. [1]

    Pagano, T. C. and Hartmann, H. C. and Sorooshian, S. , title =. Climate Research , year =

  2. [3]

    and Lettenmaier, Dennis P

    Wood, Andrew W. and Lettenmaier, Dennis P. , title =. Bulletin of the American Meteorological Society , year =

  3. [4]

    and Cayan, Daniel R

    Dettinger, Michael D. and Cayan, Daniel R. , title =. San Francisco Estuary and Watershed Science , year =

  4. [5]

    2026 , howpublished =

  5. [6]

    and Lehner, Flavio and Ikeda, Kyoko and Clark, Martyn P

    Musselman, Keith N. and Lehner, Flavio and Ikeda, Kyoko and Clark, Martyn P. and Prein, Andreas F. and Liu, Changhai and Barlage, Mike and Rasmussen, Roy , title =. Nature Climate Change , year =

  6. [7]

    and Kratzert, Frederik and Sampson, Alden Keefe and Pelissier, Craig S

    Nearing, Grey S. and Kratzert, Frederik and Sampson, Alden Keefe and Pelissier, Craig S. and Klotz, Daniel and Frame, Jonathan M. and Prieto, Cristina and Gupta, Hoshin V. , title =. Water Resources Research , year =

  7. [8]

    Towards learning universal, regional, and local hydrological behaviors via machine learning applied to large-sample datasets , journal =

    Kratzert, Frederik and Klotz, Daniel and Shalev, Guy and Klambauer, G\". Towards learning universal, regional, and local hydrological behaviors via machine learning applied to large-sample datasets , journal =. 2019 , volume =

  8. [9]

    Proceedings of the 22nd

    Chen, Tianqi and Guestrin, Carlos , title =. Proceedings of the 22nd. 2016 , pages =

  9. [10]

    Water , year =

    Tyralis, Hristos and Papacharalampous, Georgia and Langousis, Andreas , title =. Water , year =

  10. [11]

    Estimation of the Continuous Ranked Probability Score with Limited Information and Applications to Ensemble Weather Forecasts , journal =

    Zamo, Micha\". Estimation of the Continuous Ranked Probability Score with Limited Information and Applications to Ensemble Weather Forecasts , journal =. 2018 , volume =

  11. [12]

    Weather and Forecasting , year =

    Hersbach, Hans , title =. Weather and Forecasting , year =

  12. [13]

    , title =

    Gneiting, Tilmann and Raftery, Adrian E. , title =. Journal of the American Statistical Association , year =

  13. [14]

    and Zeng, Xubin and Dawson, Nicholas , title =

    Broxton, Patrick D. and Zeng, Xubin and Dawson, Nicholas , title =. Journal of Hydrometeorology , year =

  14. [15]

    and Gibson, Wayne P

    Daly, Christopher and Halbleib, Michael and Smith, Joseph I. and Gibson, Wayne P. and Doggett, Matthew K. and Taylor, George H. and Curtis, Jan and Pasteris, Phillip P. , title =. International Journal of Climatology , year =

  15. [16]

    and Hendon, Harry H

    Wheeler, Matthew C. and Hendon, Harry H. , title =. Monthly Weather Review , year =

  16. [17]

    and Lee, Su-In , title =

    Lundberg, Scott M. and Lee, Su-In , title =. Advances in Neural Information Processing Systems , year =

  17. [19]

    and Guez, Arthur and Sifre, Laurent and van den Driessche, George and Schrittwieser, Julian and Antonoglou, Ioannis and Panneershelvam, Veda and Lanctot, Marc and others , title =

    Silver, David and Huang, Aja and Maddison, Chris J. and Guez, Arthur and Sifre, Laurent and van den Driessche, George and Schrittwieser, Julian and Antonoglou, Ioannis and Panneershelvam, Veda and Lanctot, Marc and others , title =. Nature , year =

  18. [20]

    Nature , year =

    Silver, David and Schrittwieser, Julian and Simonyan, Karen and Antonoglou, Ioannis and Huang, Aja and Guez, Arthur and Hubert, Thomas and Baker, Lucas and Lai, Matthew and Bolton, Adrian and others , title =. Nature , year =

  19. [21]

    Water Supply Forecast Rodeo , year =

  20. [22]

    Journal of Hydrologic Engineering , year =

    Harrison, Brent and Bales, Roger , title =. Journal of Hydrologic Engineering , year =

  21. [23]

    DeFlorio and Mohammadvaghef Ghazvinian and Mu Xiao and Ming Pan and Jacob Kollen and Andrew Reising and Angelique Fabbiani-Leon and David Rizzardo and Julie Kalansky

    Zhiqi Yang and Weiming Hu and Agniv Sengupta and Luca Delle Monache and Michael J. DeFlorio and Mohammadvaghef Ghazvinian and Mu Xiao and Ming Pan and Jacob Kollen and Andrew Reising and Angelique Fabbiani-Leon and David Rizzardo and Julie Kalansky. Improving Weeks 1–2 Temperature Forecasts in the Sierra Nevada Region Using Analog Ensemble Postprocessing ...

  22. [24]

    2022 , month =

    Rizzardo, David , title =. 2022 , month =

  23. [25]

    and Powley, Edward and Whitehouse, Daniel and Lucas, Simon M

    Browne, Cameron B. and Powley, Edward and Whitehouse, Daniel and Lucas, Simon M. and Cowling, Peter I. and Rohlfshagen, Philipp and Tavener, Stephen and Perez, Diego and Samothrakis, Spyridon and Colton, Simon , journal=. A Survey of. 2012 , volume=

  24. [27]

    and Langenbrunner, Baird and Neelin, J

    Swain, Daniel L. and Langenbrunner, Baird and Neelin, J. David and Hall, Alex , title =. Nature Climate Change , volume =. 2018 , doi =

  25. [29]

    Marshall and John T

    Adrienne M. Marshall and John T. Abatzoglou and Stefan Rahimi and Dennis P. Lettenmaier and Alex Hall , title =. Proceedings of the National Academy of Sciences , volume =. 2024 , doi =

  26. [31]

    2026 , eprint=

    Solving an Open Problem in Theoretical Physics using AI-Assisted Discovery , author=. 2026 , eprint=

  27. [32]

    2026 , eprint=

    Discovering Mechanistic Models of Neural Activity: System Identification in an in Silico Zebrafish , author=. 2026 , eprint=

  28. [35]

    Proceedings of the 31st International Conference on Neural Information Processing Systems , pages =

    Ke, Guolin and Meng, Qi and Finley, Thomas and Wang, Taifeng and Chen, Wei and Ma, Weidong and Ye, Qiwei and Liu, Tie-Yan , title =. Proceedings of the 31st International Conference on Neural Information Processing Systems , pages =. 2017 , abstract =

  29. [36]

    2004 , publisher =

    Snow Data Assimilation System (. 2004 , publisher =. doi:10.7265/N5TB14TC , note =

  30. [38]

    and Ralph, Fred Martin and Das, Tapash and Neiman, Paul J

    Dettinger, Michael D. and Ralph, Fred Martin and Das, Tapash and Neiman, Paul J. and Cayan, Daniel R. , TITLE =. Water , VOLUME =. 2011 , NUMBER =

  31. [39]

    and Erion, Gabriel and Chen, Hugh and DeGrave, Alex and Prutkin, Jordan M

    Lundberg, Scott M. and Erion, Gabriel and Chen, Hugh and DeGrave, Alex and Prutkin, Jordan M. and Nair, Bala and Katz, Ronit and Himmelfarb, Jonathan and Bansal, Nisha and Lee, Su-In , journal=. From local explanations to global understanding with explainable. 2020 , doi=

  32. [40]

    An AI system to help scientists write expert-level empirical software

    E. Aygün, A. Belyaeva, G. Comanici, M. Coram, H. Cui, J. Garrison, R. J. A. Kast, C. Y. McLean, P. Norgaard, Z. Shamsi, D. Smalling, J. Thompson, S. Venugopalan, B. P. Williams, C. He, S. Martinson, M. Plomecka, L. Wei, Y. Zhou, Q.-Z. Zhu, M. Abraham, E. Brand, A. Bulanova, J. A. Cardille, C. Co, S. Ellsworth, G. Joseph, M. Kane, R. Krueger, J. Kartiwa, D...

  33. [41]

    M. P. Brenner, V. Cohen-Addad, and D. Woodruff. Solving an open problem in theoretical physics using ai-assisted discovery, 2026. https://arxiv.org/abs/2603.04735

  34. [42]

    C. B. Browne, E. Powley, D. Whitehouse, S. M. Lucas, P. I. Cowling, P. Rohlfshagen, S. Tavener, D. Perez, S. Samothrakis, and S. Colton. A survey of Monte Carlo tree search methods. IEEE Transactions on Computational Intelligence and AI in Games, 4 0 (1): 0 1--43, 2012. doi:10.1109/TCIAIG.2012.2186810

  35. [43]

    P. D. Broxton, X. Zeng, and N. Dawson. Why do global reanalyses and land data assimilation products underestimate snow water equivalent? Journal of Hydrometeorology, 17 0 (11): 0 2743--2761, 2016. doi:10.1175/JHM-D-16-0056.1

  36. [44]

    Bulletin 120 : Water supply forecast

    California Department of Water Resources . Bulletin 120 : Water supply forecast. https://cdec.water.ca.gov/snow/bulletin120/, 2026 a

  37. [45]

    DWR continues to improve forecasting as spring heats up in California

    California Department of Water Resources . DWR continues to improve forecasting as spring heats up in California . https://water.ca.gov/News/Blog/2026/Mar-2026/DWR-Continues-to-Improve-Forecasting-as-Spring-Heats-up-in-California, 2026 b

  38. [46]

    Department of Water Resources : Its forecasts do not adequately account for climate change and its reasons for some reservoir releases are unclear

    California State Auditor . Department of Water Resources : Its forecasts do not adequately account for climate change and its reasons for some reservoir releases are unclear. Audit Report 2022-106, California State Auditor, May 2023. https://information.auditor.ca.gov/pdfs/reports/2022-106.pdf

  39. [47]

    & Guestrin, C

    T. Chen and C. Guestrin. XGBoost : A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining , pages 785--794, 2016. doi:10.1145/2939672.2939785

  40. [48]

    C. Daly, M. Halbleib, J. I. Smith, W. P. Gibson, M. K. Doggett, G. H. Taylor, J. Curtis, and P. P. Pasteris. Physiographically sensitive mapping of climatological temperature and precipitation across the conterminous United States . International Journal of Climatology, 28 0 (15): 0 2031--2064, 2008. doi:10.1002/joc.1688

  41. [49]

    M. D. Dettinger and D. R. Cayan. Drought and the California Delta ---a matter of extremes. San Francisco Estuary and Watershed Science, 12 0 (2), 2014. doi:10.15447/sfews.2014v12iss2art4

  42. [50]

    M. D. Dettinger, F. M. Ralph, T. Das, P. J. Neiman, and D. R. Cayan. Atmospheric rivers, floods and the water resources of California . Water, 3 0 (2): 0 445--478, 2011. doi:10.3390/w3020445

  43. [51]

    Eyring, W

    V. Eyring, W. D. Collins, P. Gentine, E. A. Barnes, M. Barreiro, T. Beucler, M. Bocquet, C. S. Bretherton, H. M. Christensen, K. Dagon, D. J. Gagne, D. Hall, D. Hammerling, S. Hoyer, F. Iglesias-Suarez, I. Lopez-Gomez, M. C. McGraw, G. A. Meehl, M. J. Molina, C. Monteleoni, J. Mueller, M. S. Pritchard, D. Rolnick, J. Runge, P. Stier, O. Watt-Meyer, K. Wei...

  44. [52]

    Harrison and R

    B. Harrison and R. Bales. Skill assessment of water supply forecasts for western Sierra Nevada watersheds. Journal of Hydrologic Engineering, 21 0 (4): 0 04016002, 2016. doi:10.1061/(ASCE)HE.1943-5584.0001327

  45. [53]

    B. Henn, C. S. Bretherton, N. Kodunov, C. Lessig, M. J. Molina, T. Arcomano, O. Watt-Meyer, G. Couairon, R. Singh, R. Brunstein, Y. Hasson, A. Jost, N. Brenowitz, P. Manshausen, N. Cresswell-Clay, D. Durran, K. J. C. Hall, J. Yuval, D. Kochkov, S. Hoyer, and I. Lopez-Gomez. AIMIP Phase 1 : systematic evaluations of AI weather and climate models, 2026. htt...

  46. [54]

    Hersbach

    H. Hersbach. Decomposition of the continuous ranked probability score for ensemble prediction systems. Weather and Forecasting, 15 0 (5): 0 559--570, 2000. doi:10.1175/1520-0434(2000)015<0559:DOTCRP>2.0.CO;2

  47. [55]

    G. Ke, Q. Meng, T. Finley, T. Wang, W. Chen, W. Ma, Q. Ye, and T.-Y. Liu. LightGBM : a highly efficient gradient boosting decision tree. In Proceedings of the 31st International Conference on Neural Information Processing Systems, NIPS'17, page 3149–3157, 2017

  48. [56]

    Kratzert, D

    F. Kratzert, D. Klotz, G. Shalev, G. Klambauer, S. Hochreiter, and G. S. Nearing. Towards learning universal, regional, and local hydrological behaviors via machine learning applied to large-sample datasets. Hydrology and Earth System Sciences, 23 0 (12): 0 5089--5110, 2019. doi:10.5194/hess-23-5089-2019

  49. [57]

    Lueckmann, V

    J.-M. Lueckmann, V. Jain, and M. Januszewski. Discovering mechanistic models of neural activity: System identification in an in silico zebrafish, 2026. https://arxiv.org/abs/2602.04492

  50. [58]

    S. M. Lundberg, G. Erion, H. Chen, A. DeGrave, J. M. Prutkin, B. Nair, R. Katz, J. Himmelfarb, N. Bansal, and S.-I. Lee. From local explanations to global understanding with explainable AI for trees. Nature Machine Intelligence, 2 0 (1): 0 56--67, 2020. doi:10.1038/s42256-019-0138-9

  51. [59]

    A. M. Marshall, J. T. Abatzoglou, S. Rahimi, D. P. Lettenmaier, and A. Hall. California’s 2023 snow deluge: Contextualizing an extreme snow year against future climate change. Proceedings of the National Academy of Sciences, 121 0 (20): 0 e2320600121, 2024. doi:10.1073/pnas.2320600121

  52. [60]

    K. N. Musselman, F. Lehner, K. Ikeda, M. P. Clark, A. F. Prein, C. Liu, M. Barlage, and R. Rasmussen. Projected increases and shifts in rain-on-snow flood risk over western North America . Nature Climate Change, 8: 0 808--812, 2018. doi:10.1038/s41558-018-0236-4

  53. [61]

    Snow data assimilation system ( SNODAS ) data products at NSIDC , version 1, 2004

    National Operational Hydrologic Remote Sensing Center . Snow data assimilation system ( SNODAS ) data products at NSIDC , version 1, 2004. Accessed 13 May 2025

  54. [62]

    G. S. Nearing, F. Kratzert, A. K. Sampson, C. S. Pelissier, D. Klotz, J. M. Frame, C. Prieto, and H. V. Gupta. What role does hydrological science play in the age of machine learning? Water Resources Research, 57: 0 e2020WR028091, 2021. doi:10.1029/2020WR028091

  55. [63]

    AlphaEvolve: A coding agent for scientific and algorithmic discovery

    A. Novikov, N. Vũ, M. Eisenberger, E. Dupont, P.-S. Huang, A. Z. Wagner, S. Shirobokov, B. Kozlovskii, F. J. R. Ruiz, A. Mehrabian, M. P. Kumar, A. See, S. Chaudhuri, G. Holland, A. Davies, S. Nowozin, P. Kohli, and M. Balog. AlphaEvolve : A coding agent for scientific and algorithmic discovery, 2025. https://arxiv.org/abs/2506.13131

  56. [64]

    T. C. Pagano, H. C. Hartmann, and S. Sorooshian. Factors affecting seasonal forecast use in Arizona water management: a case study of the 1997--98 El Ni\ n o . Climate Research, 21: 0 259--269, 2002. doi:10.3354/cr021259

  57. [65]

    CatBoost: unbiased boosting with categorical features

    L. Prokhorenkova, G. Gusev, A. Vorobev, A. V. Dorogush, and A. Gulin. CatBoost : unbiased boosting with categorical features, 2019. https://arxiv.org/abs/1706.09516

  58. [66]

    A. M. Rhoades, A. D. Jones, and P. A. Ullrich. The changing character of the California Sierra Nevada as a natural reservoir. Geophysical Research Letters, 45 0 (23): 0 13,008--13,019, 2018. doi:10.1029/2018GL080308

  59. [67]

    Rizzardo

    D. Rizzardo. S2S precipitation forecasts and snowmelt runoff forecasting. Western States Water Council S2S Workshop, San Diego, CA, May 2022. https://westernstateswater.org/wp-content/uploads/2022/03/DRizzardo-WaterSupplyForecasting-S2S-workshop-May2022-San-Diego-final.pdf

  60. [68]

    M. C. Serreze, M. P. Clark, R. L. Armstrong, D. A. McGinnis, and R. S. Pulwarty. Characteristics of the western United States snowpack from snowpack telemetry ( SNOTEL ) data. Water Resources Research, 35 0 (7): 0 2145--2160, 1999. doi:10.1029/1999WR900090

  61. [69]

    David Silver, Julian Schrittwieser, Karen Simonyan, Ioannis Antonoglou, Aja Huang, Arthur Guez, Thomas Hubert, Lucas Baker, Matthew Lai, Adrian Bolton, et al

    D. Silver, A. Huang, C. J. Maddison, A. Guez, L. Sifre, G. van den Driessche, J. Schrittwieser, I. Antonoglou, V. Panneershelvam, M. Lanctot, et al. Mastering the game of Go with deep neural networks and tree search. Nature, 529: 0 484--489, 2016. doi:10.1038/nature16961

  62. [70]

    D. L. Swain. A tale of two California droughts: Lessons amidst record warmth and dryness in a region of complex physical and human geography. Geophysical Research Letters, 42 0 (22): 0 9999--10,003, 2015. doi:10.1002/2015GL066628

  63. [71]

    D. L. Swain, B. Langenbrunner, J. D. Neelin, and A. Hall. Increasing precipitation volatility in twenty-first-century California . Nature Climate Change, 8: 0 427--433, 2018. doi:10.1038/s41558-018-0140-y

  64. [72]

    Bureau of Reclamation

    U.S. Bureau of Reclamation . Water supply forecast rodeo. DrivenData Competition, https://www.drivendata.org/competitions/group/reclamation-water-supply-forecast/, 2024

  65. [73]

    M. C. Wheeler and H. H. Hendon. An all-season real-time multivariate MJO index: Development of an index for monitoring and prediction. Monthly Weather Review, 132 0 (8): 0 1917--1932, 2004. doi:10.1175/1520-0493(2004)132<1917:AARMMI>2.0.CO;2

  66. [74]

    A. W. Wood and D. P. Lettenmaier. A test bed for new seasonal hydrologic forecasting approaches in the western United States . Bulletin of the American Meteorological Society, 87 0 (12): 0 1699--1712, 2006. doi:10.1175/BAMS-87-12-1699