SPDEBench: An Extensive Benchmark for Learning Stochastic PDEs
Pith reviewed 2026-05-22 01:42 UTC · model grok-4.3
The pith
SPDE-aware architectures outperform generic operator learners on a new benchmark for stochastic PDEs.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
SPDEBench supplies controlled datasets for regular and singular SPDEs along with baselines and metrics, and evaluations under controlled data variations demonstrate that SPDE-aware architectures generally achieve stronger performance than generic operator-learning baselines on accuracy, robustness, and out-of-distribution generalization.
What carries the argument
The SPDEBench benchmark itself: it generates and standardizes datasets for regular and singular SPDEs using specific noise approximations, basis choices, and renormalization procedures, and then evaluates models with Sobolev and distributional metrics.
Load-bearing premise
The chosen noise approximations, basis choices, and renormalization procedures produce datasets representative enough for reliable model ranking across the broader space of singular SPDEs.
What would settle it
A generic operator-learning model that matches or exceeds SPDE-aware models on accuracy and robustness across a wider collection of singular SPDEs generated with alternate noise approximations and renormalization schemes.
Original abstract
Stochastic Partial Differential Equations (SPDEs) driven by random noise play a central role in modeling physical processes with rough spatio-temporal dynamics, such as turbulent flows, superconductors, and quantum dynamics. Although machine learning (ML)-based surrogate models have shown promise for efficiently approximating such dynamics, progress remains limited by the lack of a unified benchmark with controlled data generation and comprehensive evaluation. This gap is particularly significant for singular SPDEs, for which benchmark datasets are largely unavailable and reliable simulation requires numerically delicate schemes based on renormalization. Moreover, subtle differences in data-generation procedures, such as noise approximation, basis choice, and the inclusion of renormalization, can significantly affect the resulting datasets and, consequently, model evaluation. We introduce SPDEBench, the first unified benchmark for ML-based SPDE learning. SPDEBench provides ready-to-use datasets for physically and mathematically significant SPDEs on 1-3D domains with periodic or Dirichlet boundary conditions. Both regular and singular SPDEs are taken into consideration. SPDEBench also incorporates representative ML baselines in operator learning, together with 7 evaluation metrics, including Sobolev and distributional metrics beyond the standard $L^2$-error. Supported by SPDEBench, we conduct systematic evaluations of model accuracy, robustness, and out-of-distribution generalization under controlled data variations. Our numerical results show that SPDE-aware architectures generally achieve stronger performance than generic operator-learning baselines. These findings establish SPDEBench as a reproducible and extensible resource, paving the way for principled benchmarking and architecture design for stochastic spatio-temporal dynamics.
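To make the distinction between the standard $L^2$-error and a Sobolev metric concrete, here is a minimal sketch in Python. It is not SPDEBench's implementation: the abstract does not give the metric definitions, so the sketch assumes a uniform periodic 1D grid and the usual Fourier-multiplier definition of the $H^s$ norm, with weights $(1 + |k|^2)^s$. The function name `relative_sobolev_error` and its parameters are hypothetical.

```python
import numpy as np

def relative_sobolev_error(pred, true, s=1.0, length=1.0):
    """Relative H^s error on a uniform periodic 1D grid (assumed setting).

    s = 0 reduces to the relative L^2 error. The last axis is space.
    """
    n = true.shape[-1]
    # Angular wavenumbers of the periodic grid with spacing length / n.
    k = 2.0 * np.pi * np.fft.fftfreq(n, d=length / n)
    weight = (1.0 + k**2) ** s  # Fourier multiplier of the H^s norm
    diff_hat = np.fft.fft(pred - true, axis=-1)
    true_hat = np.fft.fft(true, axis=-1)
    num = np.sqrt(np.sum(weight * np.abs(diff_hat) ** 2, axis=-1))
    den = np.sqrt(np.sum(weight * np.abs(true_hat) ** 2, axis=-1))
    return num / den

# Illustrative inputs only: a smooth target plus a small high-frequency error.
x = np.linspace(0.0, 1.0, 128, endpoint=False)
true = np.sin(2 * np.pi * x)
pred = true + 0.01 * np.sin(2 * np.pi * 20 * x)

l2_err = relative_sobolev_error(pred, true, s=0.0)
h1_err = relative_sobolev_error(pred, true, s=1.0)
print(l2_err, h1_err)
```

In this toy case the $H^1$ error is much larger than the $L^2$ error, because the Sobolev weight penalizes high-frequency discrepancies that an $L^2$ comparison barely registers. That is one reason a benchmark for rough solutions might report both.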
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces SPDEBench, the first unified benchmark for machine learning surrogate models of stochastic partial differential equations (SPDEs). It supplies ready-to-use datasets for both regular and singular SPDEs on 1-3D domains with periodic or Dirichlet boundary conditions, generated under controlled variations in noise approximation, basis choice, and renormalization. The benchmark incorporates representative operator-learning baselines and evaluates them with seven metrics that include Sobolev and distributional measures in addition to the standard $L^2$ error. Systematic experiments under these controlled variations lead to the finding that SPDE-aware architectures generally outperform generic baselines on accuracy, robustness, and out-of-distribution generalization.
Significance. If the reported rankings prove stable, SPDEBench would constitute a valuable, reproducible resource that fills a clear gap in standardized evaluation for stochastic spatio-temporal modeling, particularly for singular SPDEs whose simulation requires delicate renormalization. The provision of controlled data variations, multiple metrics beyond L2, and emphasis on extensibility are genuine strengths that could guide future architecture development in scientific machine learning.
major comments (1)
- Abstract: The manuscript explicitly states that 'subtle differences in data-generation procedures, such as noise approximation, basis choice, and the inclusion of renormalization, can significantly affect the resulting datasets' for singular SPDEs, yet no sensitivity analysis is presented to verify that the reported performance advantages of SPDE-aware architectures remain stable under reasonable alternative choices of these procedures. This directly affects the reliability of the central comparative claim.
Simulated Author's Rebuttal
We thank the referee for the positive evaluation of SPDEBench and for the constructive major comment. We address the concern regarding sensitivity analysis below.
- Referee: Abstract: The manuscript explicitly states that 'subtle differences in data-generation procedures, such as noise approximation, basis choice, and the inclusion of renormalization, can significantly affect the resulting datasets' for singular SPDEs, yet no sensitivity analysis is presented to verify that the reported performance advantages of SPDE-aware architectures remain stable under reasonable alternative choices of these procedures. This directly affects the reliability of the central comparative claim.
Authors: We agree that explicitly verifying the stability of performance rankings under alternative data-generation choices strengthens the reliability of the central claims. While the manuscript already reports systematic evaluations of accuracy, robustness, and out-of-distribution generalization under controlled variations in noise approximation, basis choice, and renormalization (detailed in Sections 3–5 and the associated figures), these experiments do not include a dedicated sensitivity study that perturbs the procedures beyond the primary controlled settings and re-evaluates the relative ordering of SPDE-aware versus generic architectures. In the revised manuscript we will add such a sensitivity analysis, consisting of additional experiments with reasonable alternative noise discretizations, basis expansions, and renormalization thresholds for the singular SPDEs. This will directly confirm (or qualify) the stability of the reported advantages.
Revision: yes
Circularity Check
No significant circularity in empirical benchmark evaluation
The paper introduces SPDEBench as an empirical benchmark with generated datasets and comparative model evaluations on accuracy, robustness, and OOD generalization. Its central claim rests on numerical results from these datasets rather than any mathematical derivation or prediction that reduces to fitted inputs or self-citations by construction. No load-bearing steps match the enumerated circularity patterns; the work is self-contained as a reproducible resource for external validation.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: Standard numerical schemes based on renormalization are required and sufficient for generating reliable singular SPDE datasets.
Reference graph
Works this paper leans on
- [1] H. Bahouri, J.-Y. Chemin, and R. Danchin. Fourier Analysis and Nonlinear Partial Differential Equations. Springer Berlin Heidelberg, 2011.
- [2] I. Chevyrev, A. Gerasimovičs, and H. Weber. Feature engineering with regularity structures. Journal of Scientific Computing, 98(1):13, 2024.
- [3] G. Da Prato and J. Zabczyk. Stochastic Equations in Infinite Dimensions, volume 152. Cambridge University Press, 2014.
- [4] S. Gong, P. Hu, Q. Meng, Y. Wang, R. Zhu, B. Chen, Z. Ma, H. Ni, and T.-Y. Liu. Deep latent regularity network for modeling stochastic partial differential equations. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 37, pages 7740–7747, 2023.
- [5] M. Gubinelli and M. Hofmanová. Global solutions to elliptic and parabolic $\Phi^4$ models in Euclidean space. Communications in Mathematical Physics, 368(3):1201–1266, 2019.
- [6] M. Gubinelli, P. Imkeller, and N. Perkowski. Paracontrolled distributions and singular PDEs. In Forum of Mathematics, Pi, volume 3, page e6. Cambridge University Press, 2015.
- [7] M. Hairer. Solving the KPZ equation. Annals of Mathematics, pages 559–664, 2013.
- [8] M. Hairer. A theory of regularity structures. Inventiones mathematicae, 198(2):269–504, 2014.
- [9]
- [10] J. R. Klauder. Stochastic quantization. In Recent Developments in High-Energy Physics, pages 251–281. Springer, 1983.
- [11] Z. Li, N. Kovachki, K. Azizzadenesheli, B. Liu, K. Bhattacharya, A. Stuart, and A. Anandkumar. Fourier neural operator for parametric partial differential equations. arXiv preprint arXiv:2010.08895, 2020.
- [12] G. J. Lord, C. E. Powell, and T. Shardlow. An Introduction to Computational Stochastic PDEs, volume 50. Cambridge University Press, 2014.
- [13] L. Lu, P. Jin, G. Pang, Z. Zhang, and G. E. Karniadakis. Learning nonlinear operators via DeepONet based on the universal approximation theorem of operators. Nature Machine Intelligence, 3(3):218–229, 2021.
- [14] J. Morrill, C. Salvi, P. Kidger, and J. Foster. Neural rough differential equations for long time series. In International Conference on Machine Learning, pages 7829–7838. PMLR, 2021.
- [15] A. Neufeld and P. Schmocker. Solving stochastic partial differential equations using neural networks in the Wiener chaos expansion. arXiv preprint arXiv:2411.03384, 2024.
- [16]
- [17] H. Triebel. Theory of Function Spaces III, volume 100. Birkhäuser Basel, 2006.
This is the Supplementary Materials for "SPDEBench: An Extensive Benchmark for Learning Regular and Singular Stochastic PDEs".
A Basic Concepts in SPDE Theory
A.1 Space-time white noise
In most cases, the noise term $\xi = \xi(t, x)$ is introduced to represent a so-called space-time ...
- $W(0) = 0$ almost surely.
- $W(t; \omega)$ is a continuous sample trajectory $\mathbb{R}_+ \to H$, for each $\omega \in \Omega$.
- $W(t)$ is $\mathcal{F}_t$-adapted and $W(t) - W(s)$ is independent of $\mathcal{F}_s$ for $s < t$.
- $W(t) - W(s) \sim N(0, (t - s)Q)$ for all $0 \le s \le t$.
In analogy to the Karhunen–Loève expansion, it can be shown that $W(t)$ is a $Q$-Wiener process if and only if for all $t \ge 0$,
$$W(t) = \sum_{j=1}^{\infty} \sqrt{\lambda_j}\, \phi_j\, \beta_j(t),$$
where $\beta_j(t)$ are i.i.d. Brownian motions, and the series converges in $L^2(\Omega, H)$. Moreover, the series is $P$-a.s. uniformly convergent on $[0, T]$ for arbitrary T...
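The series representation of a Q-Wiener process quoted above suggests a simple way to draw approximate sample paths: truncate the sum after finitely many modes and simulate each Brownian motion by cumulative sums of Gaussian increments. The following is a minimal sketch, not the benchmark's generator. Everything concrete in it is an assumption made for illustration: the space $H = L^2(0, 1)$, the sine eigenfunctions $\phi_j(x) = \sqrt{2}\sin(j\pi x)$, the eigenvalue decay $\lambda_j = j^{-\text{decay}}$, and the function name `sample_q_wiener`.

```python
import numpy as np

def sample_q_wiener(n_modes=64, n_steps=200, n_x=129, T=1.0, decay=2.0, seed=0):
    """Truncated series sample of a Q-Wiener process on [0, T] x [0, 1].

    Assumed (hypothetical) setup: phi_j(x) = sqrt(2) sin(j pi x) and
    lambda_j = j ** (-decay). Returns times, grid points, and W[time, space].
    """
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    t = np.linspace(0.0, T, n_steps + 1)
    x = np.linspace(0.0, 1.0, n_x)

    j = np.arange(1, n_modes + 1).astype(float)
    lam = j ** (-decay)                                    # eigenvalues of Q
    phi = np.sqrt(2.0) * np.sin(np.pi * np.outer(j, x))    # (n_modes, n_x)

    # Independent Brownian motions beta_j: start at 0, N(0, dt) increments.
    dbeta = rng.normal(0.0, np.sqrt(dt), size=(n_steps, n_modes))
    beta = np.vstack([np.zeros((1, n_modes)), np.cumsum(dbeta, axis=0)])

    # W(t, x) = sum_j sqrt(lambda_j) * phi_j(x) * beta_j(t), truncated.
    W = (beta * np.sqrt(lam)) @ phi                        # (n_steps + 1, n_x)
    return t, x, W

t, x, W = sample_q_wiener()
print(W.shape)
```

A slower eigenvalue decay gives spatially rougher paths, and the number of retained modes is exactly the kind of noise-approximation choice that the review above flags as able to affect the resulting datasets.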