SPDEBench: An Extensive Benchmark for Learning Stochastic PDEs

Bingguang Chen; Dai Shi; Hao Ni; Jose Miguel Lara Rangel; Luke Thompson; Oliver Nash; Qi Meng; Rongchan Zhu; Siran Li; Yuantu Zhu; Zheyan Li

arXiv: 2505.18511 · v3 · pith:46O54PYVnew · submitted 2025-05-24 · 💻 cs.LG · math.AP · physics.comp-ph

Pith reviewed 2026-05-22 01:42 UTC · model grok-4.3

classification 💻 cs.LG · math.AP · physics.comp-ph

keywords stochastic partial differential equations · machine learning benchmarks · operator learning · singular SPDEs · surrogate models · spatio-temporal dynamics · model evaluation · renormalization

The pith

SPDE-aware architectures generally outperform generic operator learners on a new benchmark for stochastic PDEs.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces SPDEBench as the first unified benchmark supplying ready-to-use datasets for both regular and singular stochastic partial differential equations on one- to three-dimensional domains. It pairs these datasets with representative machine learning baselines and seven evaluation metrics that extend beyond standard L2 error to include Sobolev and distributional measures. Systematic tests under controlled variations in data generation show that models incorporating awareness of SPDE structure deliver stronger accuracy, robustness, and out-of-distribution generalization than generic operator-learning approaches. The benchmark addresses the lack of standardized resources for singular SPDEs, where noise approximations and renormalization affect dataset quality and model rankings. This setup creates a reproducible foundation for comparing and improving surrogate models of rough physical dynamics.

Core claim

SPDEBench supplies controlled datasets for regular and singular SPDEs along with baselines and metrics, and evaluations under controlled data variations demonstrate that SPDE-aware architectures generally achieve stronger performance than generic operator-learning baselines on accuracy, robustness, and out-of-distribution generalization.

What carries the argument

SPDEBench benchmark, which generates and standardizes datasets for regular and singular SPDEs using specific noise approximations, basis choices, and renormalization procedures, then evaluates models with Sobolev and distributional metrics.
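
To make the difference between a plain L2 error and a Sobolev-type error concrete, here is a minimal sketch for a 1D periodic grid. It is an illustration only: the function names, the choice of the H^1 norm, and the Fourier-space weighting (1 + |k|^2) are assumptions made for this example and are not taken from the SPDEBench code.

```python
import numpy as np


def relative_l2_error(pred, true):
    """Relative L2 error between two fields sampled on the same uniform grid."""
    return np.linalg.norm(pred - true) / np.linalg.norm(true)


def relative_h1_error(pred, true, length=1.0):
    """Relative H^1 error on a 1D periodic grid, computed in Fourier space.

    The squared H^1 norm weights each Fourier mode by (1 + |k|^2), so an
    error concentrated at high frequencies counts more than it does in L2.
    """
    n = true.shape[-1]
    # Angular wavenumbers of the discrete Fourier modes (illustrative convention).
    k = 2.0 * np.pi * np.fft.fftfreq(n, d=length / n)
    weight = 1.0 + k**2
    diff_hat = np.fft.fft(pred - true)
    true_hat = np.fft.fft(true)
    num = np.sum(weight * np.abs(diff_hat) ** 2)
    den = np.sum(weight * np.abs(true_hat) ** 2)
    return np.sqrt(num / den)


# Toy example: a smooth target plus a small high-frequency perturbation.
x = np.linspace(0.0, 1.0, 128, endpoint=False)
true = np.sin(2 * np.pi * x)
pred = true + 0.01 * np.sin(2 * np.pi * 20 * x)

l2 = relative_l2_error(pred, true)
h1 = relative_h1_error(pred, true)
print(f"relative L2 error: {l2:.4f}, relative H1 error: {h1:.4f}")
```

In this toy case the perturbation is small in L2 but much larger in H^1, which is the kind of roughness-related discrepancy a Sobolev metric is meant to expose.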

Load-bearing premise

The chosen noise approximations, basis choices, and renormalization procedures produce datasets representative enough for reliable model ranking across the broader space of singular SPDEs.
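
The sensitivity to noise approximation and basis choice can be pictured with a truncated spectral expansion of a Q-Wiener process, W(t) = sum_j sqrt(lambda_j) phi_j beta_j(t). The sketch below is a hedged illustration, not the benchmark's generator: the sine basis (Dirichlet boundary on [0, 1]), the eigenvalue decay `j**(-2 * decay)`, and every function and parameter name are assumptions chosen for the example.

```python
import numpy as np


def sample_q_wiener(n_modes, n_x, n_t, dt, decay=1.0, seed=0):
    """Sample a truncated Q-Wiener process on [0, 1] with a sine basis.

    Returns the grid x of shape (n_x,) and the path W of shape (n_t + 1, n_x).
    The truncation level n_modes and the decay exponent are the knobs that a
    different "noise approximation" would change.
    """
    rng = np.random.default_rng(seed)
    x = np.linspace(0.0, 1.0, n_x)
    j = np.arange(1, n_modes + 1, dtype=float)
    lam = j ** (-2.0 * decay)  # assumed eigenvalues of the covariance operator Q
    phi = np.sqrt(2.0) * np.sin(np.pi * np.outer(j, x))  # basis, shape (n_modes, n_x)

    # Independent Brownian motions beta_j(t), built from Gaussian increments.
    dbeta = rng.normal(0.0, np.sqrt(dt), size=(n_t, n_modes))
    beta = np.vstack([np.zeros((1, n_modes)), np.cumsum(dbeta, axis=0)])

    # W(t, x) = sum_j sqrt(lam_j) * beta_j(t) * phi_j(x)
    w = (beta * np.sqrt(lam)) @ phi
    return x, w


x_grid, w_coarse = sample_q_wiener(n_modes=8, n_x=65, n_t=100, dt=1e-2)
_, w_fine = sample_q_wiener(n_modes=64, n_x=65, n_t=100, dt=1e-2)
print(w_coarse.shape, w_fine.shape)
```

Changing `n_modes`, `decay`, or the basis yields visibly different noise paths on the same grid, which is why the premise above matters for any ranking derived from such data.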

What would settle it

A generic operator-learning model that matches or exceeds SPDE-aware models on accuracy and robustness across a wider collection of singular SPDEs generated with alternate noise approximations and renormalization schemes.

Figures

Figures reproduced from arXiv: 2505.18511 by Bingguang Chen, Dai Shi, Hao Ni, Jose Miguel Lara Rangel, Luke Thompson, Oliver Nash, Qi Meng, Rongchan Zhu, Siran Li, Yuantu Zhu, Zheyan Li.

**Figure 2.** Comparison of FNO and NSPDE trained on KdV datasets with noise generated by …

**Figure 3.** Illustration of the time evolution of the NSE( …

Original abstract

Stochastic Partial Differential Equations (SPDEs) driven by random noise play a central role in modeling physical processes with rough spatio-temporal dynamics, such as turbulent flows, superconductors, and quantum dynamics. Although machine learning (ML)-based surrogate models have shown promise for efficiently approximating such dynamics, progress remains limited by the lack of a unified benchmark with controlled data generation and comprehensive evaluation. This gap is particularly significant for singular SPDEs, for which benchmark datasets are largely unavailable and reliable simulation requires numerically delicate schemes based on renormalization. Moreover, subtle differences in data-generation procedures, such as noise approximation, basis choice, and the inclusion of renormalization, can significantly affect the resulting datasets and, consequently, model evaluation. We introduce SPDEBench, the first unified benchmark for ML-based SPDE learning. SPDEBench provides ready-to-use datasets for physically and mathematically significant SPDEs on 1-3D domains with periodic or Dirichlet boundary conditions. Both regular and singular SPDEs are taken into consideration. SPDEBench also incorporates representative ML baselines in operator learning, together with 7 evaluation metrics, including Sobolev and distributional metrics beyond the standard $L^2$-error. Supported by SPDEBench, we conduct systematic evaluations of model accuracy, robustness, and out-of-distribution generalization under controlled data variations. Our numerical results show that SPDE-aware architectures generally achieve stronger performance than generic operator-learning baselines. These findings establish SPDEBench as a reproducible and extensible resource, paving the way for principled benchmarking and architecture design for stochastic spatio-temporal dynamics.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

SPDEBench gives the field its first shared datasets for both regular and singular SPDEs with distributional metrics, but model rankings rest on untested choices in noise and renormalization.

SPDEBench is the first benchmark that puts regular and singular SPDEs on the same footing with controlled generation and distributional metrics. That alone makes it worth a look for anyone working on learning stochastic dynamics.

The paper supplies ready-to-use datasets for physically relevant SPDEs in one to three dimensions, with both periodic and Dirichlet boundaries. It includes renormalization for the singular cases, which is necessary but numerically tricky. They also bring in standard operator-learning baselines and evaluate on seven metrics that go beyond plain L2 error. Their experiments vary the data generation in controlled ways and track accuracy, robustness, and out-of-distribution performance. The results indicate that architectures designed with SPDE structure in mind tend to beat generic baselines.

What stands out is the effort to make the resource reproducible and extensible. They flag how noise approximation, basis choice, and renormalization can change the data, which is honest. The soft spot is that they do not test how sensitive the model rankings are to those same choices. The abstract itself says these details matter a lot for singular SPDEs, yet there is no sensitivity study showing the performance ordering stays the same under reasonable variations. That leaves open whether the claimed advantage for SPDE-aware models is robust or tied to the specific generation pipeline they picked. Reproduction might also require checking the supplementary material for every renormalization step and boundary implementation.

This paper is aimed at researchers in scientific machine learning who need standardized test cases for stochastic PDE surrogates. Anyone comparing new architectures or metrics will find the datasets and evaluation setup immediately useful. I would send it to peer review. A benchmark like this can move the field from scattered experiments to something more systematic, and the authors have done the groundwork even if a few questions on stability remain.

Referee Report

1 major / 0 minor

Summary. The paper introduces SPDEBench, the first unified benchmark for machine learning surrogate models of stochastic partial differential equations (SPDEs). It supplies ready-to-use datasets for both regular and singular SPDEs on 1-3D domains with periodic or Dirichlet boundary conditions, generated under controlled variations in noise approximation, basis choice, and renormalization. The benchmark incorporates representative operator-learning baselines and evaluates them with seven metrics that include Sobolev and distributional measures in addition to standard L2 error. Systematic experiments under these controlled variations lead to the finding that SPDE-aware architectures generally outperform generic baselines on accuracy, robustness, and out-of-distribution generalization.
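
As an illustration of what a distributional metric measures, compared with a pathwise L2 error, the following sketch estimates the squared maximum mean discrepancy (MMD) between two sets of sampled fields. MMD with a Gaussian kernel is used here only as a common example of such a metric; whether it is among the seven metrics in the benchmark is not stated in the text above, and all names and the bandwidth choice are assumptions.

```python
import numpy as np


def rbf_kernel(a, b, bandwidth):
    """Gaussian kernel matrix between rows of a (n, d) and rows of b (m, d)."""
    sq_dist = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point cancellation.
    return np.exp(-np.maximum(sq_dist, 0.0) / (2.0 * bandwidth**2))


def mmd_squared(x, y, bandwidth=1.0):
    """Biased (V-statistic) estimate of the squared MMD between two sample sets.

    Each row of x and y is one flattened field, e.g. a model prediction or a
    reference solution at a fixed time.
    """
    return (
        rbf_kernel(x, x, bandwidth).mean()
        + rbf_kernel(y, y, bandwidth).mean()
        - 2.0 * rbf_kernel(x, y, bandwidth).mean()
    )


rng = np.random.default_rng(0)
reference = rng.normal(size=(50, 8))   # stand-in for reference samples
shifted = reference + 3.0              # stand-in for a biased model's samples

bw = np.sqrt(reference.shape[1])
mmd_same = mmd_squared(reference, reference, bandwidth=bw)
mmd_shifted = mmd_squared(reference, shifted, bandwidth=bw)
print(f"MMD^2 same: {mmd_same:.3e}, MMD^2 shifted: {mmd_shifted:.3f}")
```

A metric of this kind compares the law of the predicted fields with the law of the reference fields, so it can penalize a model that matches individual paths poorly but also reward one that reproduces the correct statistics.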

Significance. If the reported rankings prove stable, SPDEBench would constitute a valuable, reproducible resource that fills a clear gap in standardized evaluation for stochastic spatio-temporal modeling, particularly for singular SPDEs whose simulation requires delicate renormalization. The provision of controlled data variations, multiple metrics beyond L2, and emphasis on extensibility are genuine strengths that could guide future architecture development in scientific machine learning.

major comments (1)

Abstract: The manuscript explicitly states that 'subtle differences in data-generation procedures, such as noise approximation, basis choice, and the inclusion of renormalization, can significantly affect the resulting datasets' for singular SPDEs, yet no sensitivity analysis is presented to verify that the reported performance advantages of SPDE-aware architectures remain stable under reasonable alternative choices of these procedures. This directly affects the reliability of the central comparative claim.

Simulated Authors' Rebuttal

1 response · 0 unresolved

We thank the referee for the positive evaluation of SPDEBench and for the constructive major comment. We address the concern regarding sensitivity analysis below.

Referee: Abstract: The manuscript explicitly states that 'subtle differences in data-generation procedures, such as noise approximation, basis choice, and the inclusion of renormalization, can significantly affect the resulting datasets' for singular SPDEs, yet no sensitivity analysis is presented to verify that the reported performance advantages of SPDE-aware architectures remain stable under reasonable alternative choices of these procedures. This directly affects the reliability of the central comparative claim.

Authors: We agree that explicitly verifying the stability of performance rankings under alternative data-generation choices strengthens the reliability of the central claims. While the manuscript already reports systematic evaluations of accuracy, robustness, and out-of-distribution generalization under controlled variations in noise approximation, basis choice, and renormalization (detailed in Sections 3–5 and the associated figures), these experiments do not include a dedicated sensitivity study that perturbs the procedures beyond the primary controlled settings and re-evaluates the relative ordering of SPDE-aware versus generic architectures. In the revised manuscript we will add such a sensitivity analysis, consisting of additional experiments with reasonable alternative noise discretizations, basis expansions, and renormalization thresholds for the singular SPDEs. This will directly confirm (or qualify) the stability of the reported advantages.

Revision: yes
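
One simple way to quantify the ranking stability discussed in this exchange is a rank correlation between model orderings obtained under two data-generation configurations. The sketch below computes Kendall's tau (the tau-a variant, which ignores ties) in plain Python. The model names and error values are placeholders invented for the example; they are not results from the paper.

```python
from itertools import combinations


def kendall_tau(scores_a, scores_b):
    """Kendall's tau-a between two score dictionaries keyed by model name.

    Returns a value in [-1, 1]: 1 means both configurations rank the models
    identically, -1 means the rankings are exactly reversed.
    """
    models = sorted(scores_a)
    concordant = 0
    discordant = 0
    for m, n in combinations(models, 2):
        sign = (scores_a[m] - scores_a[n]) * (scores_b[m] - scores_b[n])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(models) * (len(models) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical test errors under two noise/renormalization settings.
errors_config_1 = {"model_A": 0.10, "model_B": 0.20, "model_C": 0.30}
errors_config_2 = {"model_A": 0.12, "model_B": 0.35, "model_C": 0.25}

tau = kendall_tau(errors_config_1, errors_config_2)
print(f"Kendall tau between rankings: {tau:.3f}")
```

A value close to 1 across many pairs of configurations would support the claim that the reported ordering is not an artifact of one generation pipeline; values well below 1 would qualify it.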

Circularity Check

0 steps flagged

No significant circularity in empirical benchmark evaluation

The paper introduces SPDEBench as an empirical benchmark with generated datasets and comparative model evaluations on accuracy, robustness, and OOD generalization. Its central claim rests on numerical results from these datasets rather than any mathematical derivation or prediction that reduces to fitted inputs or self-citations by construction. No load-bearing steps match the enumerated circularity patterns; the work is self-contained as a reproducible resource for external validation.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The benchmark rests on standard numerical schemes for SPDEs (including renormalization for singular cases) and on the choice of seven evaluation metrics; no new physical axioms or invented entities are introduced.

axioms (1)

Domain assumption: Standard numerical schemes based on renormalization are required and sufficient for generating reliable singular SPDE datasets.
Invoked when the abstract states that reliable simulation of singular SPDEs requires numerically delicate renormalization-based schemes.
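
For readers unfamiliar with what renormalization means here, a standard textbook illustration is the dynamical Φ^4 model in two dimensions driven by mollified noise. It is given as a hedged example of the general idea, not as the exact scheme or convention used in SPDEBench; the symbols u_ε, ξ_ε, Z_ε, and C_ε are notation introduced for this sketch.

```latex
\begin{aligned}
\partial_t u_\varepsilon &= \Delta u_\varepsilon
  - \bigl(u_\varepsilon^{3} - 3\,C_\varepsilon\,u_\varepsilon\bigr) + \xi_\varepsilon, \\
C_\varepsilon &= \mathbb{E}\bigl[Z_\varepsilon(t,x)^{2}\bigr],
  \qquad C_\varepsilon \to \infty \ \text{as } \varepsilon \to 0 .
\end{aligned}
```

Here ξ_ε is a smooth approximation of space-time white noise and Z_ε is the stationary solution of the corresponding linear stochastic heat equation. In two dimensions C_ε diverges logarithmically as the regularization is removed, and the counterterm 3 C_ε u_ε is what allows the cubic term to have a meaningful limit. Including or omitting this term, or computing C_ε differently, changes the generated data, which is why the assumption above carries weight.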

pith-pipeline@v0.9.0 · 5840 in / 1236 out tokens · 26265 ms · 2026-05-22T01:42:37.271860+00:00 · methodology

