
arXiv: 2505.18511 · v3 · pith:46O54PYVnew · submitted 2025-05-24 · 💻 cs.LG · math.AP · physics.comp-ph

SPDEBench: An Extensive Benchmark for Learning Stochastic PDEs

Pith reviewed 2026-05-22 01:42 UTC · model grok-4.3

classification 💻 cs.LG · math.AP · physics.comp-ph
keywords stochastic partial differential equations · machine learning benchmarks · operator learning · singular SPDEs · surrogate models · spatio-temporal dynamics · model evaluation · renormalization

The pith

SPDE-aware architectures outperform generic operator learners on a new benchmark for stochastic PDEs.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces SPDEBench as the first unified benchmark supplying ready-to-use datasets for both regular and singular stochastic partial differential equations on one- to three-dimensional domains. It pairs these datasets with representative machine learning baselines and seven evaluation metrics that extend beyond standard L2 error to include Sobolev and distributional measures. Systematic tests under controlled variations in data generation show that models incorporating awareness of SPDE structure deliver stronger accuracy, robustness, and out-of-distribution generalization than generic operator-learning approaches. The benchmark addresses the lack of standardized resources for singular SPDEs, where noise approximations and renormalization affect dataset quality and model rankings. This setup creates a reproducible foundation for comparing and improving surrogate models of rough physical dynamics.
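To illustrate why a distributional measure can report something a pointwise error does not, here is a minimal sketch. It assumes equal-size one-dimensional samples and uses the Wasserstein-1 distance as a stand-in; the source does not say which distributional metrics SPDEBench implements, and the function name `wasserstein_1d` and the toy data are hypothetical.

```python
import random

def wasserstein_1d(samples_a, samples_b):
    """Wasserstein-1 distance between two equal-size 1D empirical samples.

    For equal sample sizes this reduces to the mean absolute difference
    between the sorted samples.
    """
    if len(samples_a) != len(samples_b):
        raise ValueError("this sketch assumes equal sample sizes")
    a, b = sorted(samples_a), sorted(samples_b)
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

# Toy data, not taken from the benchmark: values of a solution at one
# space-time point across many noise realizations.
rng = random.Random(0)
true_vals = [rng.gauss(0.0, 1.0) for _ in range(2000)]
good_pred = [rng.gauss(0.0, 1.0) for _ in range(2000)]  # same law, new draws
mean_only = [0.0] * 2000  # a surrogate that collapses to the mean

w_good = wasserstein_1d(true_vals, good_pred)
w_mean = wasserstein_1d(true_vals, mean_only)
print(f"same law: {w_good:.3f}, mean-only surrogate: {w_mean:.3f}")
```

The mean-only surrogate is penalized for having no spread, while a surrogate that reproduces the law of the solution scores close to zero even though its individual draws differ.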

Core claim

SPDEBench supplies controlled datasets for regular and singular SPDEs along with baselines and metrics, and evaluations under controlled data variations demonstrate that SPDE-aware architectures generally achieve stronger performance than generic operator-learning baselines on accuracy, robustness, and out-of-distribution generalization.

What carries the argument

SPDEBench benchmark, which generates and standardizes datasets for regular and singular SPDEs using specific noise approximations, basis choices, and renormalization procedures, then evaluates models with Sobolev and distributional metrics.
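The difference between an L2 metric and a Sobolev metric can be sketched as follows. This is an illustrative example only, assuming a one-dimensional periodic grid and a central-difference derivative; the helper names (`l2_norm`, `h1_norm`, `relative_error`) and the toy fields are hypothetical and are not SPDEBench's implementation.

```python
import math

def l2_norm(u, dx):
    """Discrete L2 norm on a uniform grid with spacing dx."""
    return math.sqrt(sum(v * v for v in u) * dx)

def periodic_gradient(u, dx):
    """Central-difference derivative under periodic boundary conditions."""
    n = len(u)
    return [(u[(i + 1) % n] - u[i - 1]) / (2.0 * dx) for i in range(n)]

def h1_norm(u, dx):
    """Discrete H^1 (Sobolev) norm: sqrt(||u||^2 + ||u'||^2)."""
    du = periodic_gradient(u, dx)
    return math.sqrt(l2_norm(u, dx) ** 2 + l2_norm(du, dx) ** 2)

def relative_error(pred, true, dx, norm):
    """Relative error ||pred - true|| / ||true|| in the given norm."""
    diff = [p - t for p, t in zip(pred, true)]
    return norm(diff, dx) / norm(true, dx)

# Toy fields, not benchmark data: the prediction carries a small
# high-frequency perturbation on top of the true signal.
n = 256
dx = 2.0 * math.pi / n
xs = [i * dx for i in range(n)]
true = [math.sin(x) for x in xs]
pred = [math.sin(x) + 0.01 * math.sin(20.0 * x) for x in xs]

rel_l2 = relative_error(pred, true, dx, l2_norm)
rel_h1 = relative_error(pred, true, dx, h1_norm)
print(f"relative L2 error: {rel_l2:.4f}, relative H1 error: {rel_h1:.4f}")
```

Because the H^1 norm also measures the derivative, a small-amplitude but oscillatory error is weighted far more heavily than in L2, which is relevant when the target dynamics are rough.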

Load-bearing premise

The chosen noise approximations, basis choices, and renormalization procedures produce datasets representative enough for reliable model ranking across the broader space of singular SPDEs.

What would settle it

A generic operator-learning model that matches or exceeds SPDE-aware models on accuracy and robustness across a wider collection of singular SPDEs generated with alternate noise approximations and renormalization schemes.

Figures

Figures reproduced from arXiv: 2505.18511 by Bingguang Chen, Dai Shi, Hao Ni, Jose Miguel Lara Rangel, Luke Thompson, Oliver Nash, Qi Meng, Rongchan Zhu, Siran Li, Yuantu Zhu, Zheyan Li.

Figure 1. Comparison of two data generation methods of …
Figure 2. Comparison of FNO and NSPDE trained on KdV datasets with noise generated by …
Figure 3. Illustration of the time evolution of the NSE( …
Original abstract

Stochastic Partial Differential Equations (SPDEs) driven by random noise play a central role in modeling physical processes with rough spatio-temporal dynamics, such as turbulent flows, superconductors, and quantum dynamics. Although machine learning (ML)-based surrogate models have shown promise for efficiently approximating such dynamics, progress remains limited by the lack of a unified benchmark with controlled data generation and comprehensive evaluation. This gap is particularly significant for singular SPDEs, for which benchmark datasets are largely unavailable and reliable simulation requires numerically delicate schemes based on renormalization. Moreover, subtle differences in data-generation procedures, such as noise approximation, basis choice, and the inclusion of renormalization, can significantly affect the resulting datasets and, consequently, model evaluation. We introduce SPDEBench, the first unified benchmark for ML-based SPDE learning. SPDEBench provides ready-to-use datasets for physically and mathematically significant SPDEs on 1-3D domains with periodic or Dirichlet boundary conditions. Both regular and singular SPDEs are taken into consideration. SPDEBench also incorporates representative ML baselines in operator learning, together with 7 evaluation metrics, including Sobolev and distributional metrics beyond the standard $L^2$-error. Supported by SPDEBench, we conduct systematic evaluations of model accuracy, robustness, and out-of-distribution generalization under controlled data variations. Our numerical results show that SPDE-aware architectures generally achieve stronger performance than generic operator-learning baselines. These findings establish SPDEBench as a reproducible and extensible resource, paving the way for principled benchmarking and architecture design for stochastic spatio-temporal dynamics.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 0 minor

Summary. The paper introduces SPDEBench, the first unified benchmark for machine learning surrogate models of stochastic partial differential equations (SPDEs). It supplies ready-to-use datasets for both regular and singular SPDEs on 1-3D domains with periodic or Dirichlet boundary conditions, generated under controlled variations in noise approximation, basis choice, and renormalization. The benchmark incorporates representative operator-learning baselines and evaluates them with seven metrics that include Sobolev and distributional measures in addition to standard L2 error. Systematic experiments under these controlled variations lead to the finding that SPDE-aware architectures generally outperform generic baselines on accuracy, robustness, and out-of-distribution generalization.

Significance. If the reported rankings prove stable, SPDEBench would constitute a valuable, reproducible resource that fills a clear gap in standardized evaluation for stochastic spatio-temporal modeling, particularly for singular SPDEs whose simulation requires delicate renormalization. The provision of controlled data variations, multiple metrics beyond L2, and emphasis on extensibility are genuine strengths that could guide future architecture development in scientific machine learning.

major comments (1)
  1. Abstract: The manuscript explicitly states that 'subtle differences in data-generation procedures, such as noise approximation, basis choice, and the inclusion of renormalization, can significantly affect the resulting datasets' for singular SPDEs, yet no sensitivity analysis is presented to verify that the reported performance advantages of SPDE-aware architectures remain stable under reasonable alternative choices of these procedures. This directly affects the reliability of the central comparative claim.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the positive evaluation of SPDEBench and for the constructive major comment. We address the concern regarding sensitivity analysis below.

Point-by-point responses
  1. Referee: Abstract: The manuscript explicitly states that 'subtle differences in data-generation procedures, such as noise approximation, basis choice, and the inclusion of renormalization, can significantly affect the resulting datasets' for singular SPDEs, yet no sensitivity analysis is presented to verify that the reported performance advantages of SPDE-aware architectures remain stable under reasonable alternative choices of these procedures. This directly affects the reliability of the central comparative claim.

    Authors: We agree that explicitly verifying the stability of performance rankings under alternative data-generation choices strengthens the reliability of the central claims. While the manuscript already reports systematic evaluations of accuracy, robustness, and out-of-distribution generalization under controlled variations in noise approximation, basis choice, and renormalization (detailed in Sections 3–5 and the associated figures), these experiments do not include a dedicated sensitivity study that perturbs the procedures beyond the primary controlled settings and re-evaluates the relative ordering of SPDE-aware versus generic architectures. In the revised manuscript we will add such a sensitivity analysis, consisting of additional experiments with reasonable alternative noise discretizations, basis expansions, and renormalization thresholds for the singular SPDEs. This will directly confirm (or qualify) the stability of the reported advantages.

    revision: yes
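One way such a ranking-stability check could be organized is sketched below. It assumes one scalar error per model and per data-generation variant. The model names, variant names, and numbers are placeholders for illustration only and are not results from the paper.

```python
from itertools import combinations

def ranking(errors):
    """Return model names sorted from lowest to highest error."""
    return sorted(errors, key=errors.get)

def pairwise_agreement(errors_a, errors_b):
    """Fraction of model pairs ordered the same way in both settings."""
    pairs = list(combinations(sorted(errors_a), 2))
    same = sum(
        (errors_a[m] < errors_a[n]) == (errors_b[m] < errors_b[n])
        for m, n in pairs
    )
    return same / len(pairs)

# Placeholder errors (hypothetical): one dict per data-generation variant.
errors_by_variant = {
    "baseline":   {"model_a": 0.10, "model_b": 0.20, "model_c": 0.30},
    "alt_noise":  {"model_a": 0.12, "model_b": 0.19, "model_c": 0.35},
    "alt_renorm": {"model_a": 0.25, "model_b": 0.18, "model_c": 0.40},
}

reference = errors_by_variant["baseline"]
stability = {
    name: pairwise_agreement(reference, errs)
    for name, errs in errors_by_variant.items()
}

for name, score in stability.items():
    print(name, ranking(errors_by_variant[name]), f"agreement={score:.2f}")
```

An agreement of 1.0 means the variant preserves every pairwise ordering of the baseline setting; lower values flag variants under which the comparative claim would need to be qualified.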

Circularity Check

0 steps flagged

No significant circularity in empirical benchmark evaluation


The paper introduces SPDEBench as an empirical benchmark with generated datasets and comparative model evaluations on accuracy, robustness, and OOD generalization. Its central claim rests on numerical results from these datasets rather than any mathematical derivation or prediction that reduces to fitted inputs or self-citations by construction. No load-bearing steps match the enumerated circularity patterns; the work is self-contained as a reproducible resource for external validation.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The benchmark rests on standard numerical schemes for SPDEs (including renormalization for singular cases) and on the choice of seven evaluation metrics; no new physical axioms or invented entities are introduced.

axioms (1)
  • domain assumption: Standard numerical schemes based on renormalization are required and sufficient for generating reliable singular SPDE datasets.
    Invoked when the abstract states that reliable simulation of singular SPDEs requires numerically delicate renormalization-based schemes.

pith-pipeline@v0.9.0 · 5840 in / 1236 out tokens · 26265 ms · 2026-05-22T01:42:37.271860+00:00 · methodology


Reference graph

Works this paper leans on

21 extracted references · 21 canonical work pages · 1 internal anchor

  1. [1]

    H. Bahouri, J.-Y. Chemin, and R. Danchin. Fourier Analysis and Nonlinear Partial Differential Equations. Springer Berlin Heidelberg, 2011

  2. [2]

    I. Chevyrev, A. Gerasimovičs, and H. Weber. Feature engineering with regularity structures. Journal of Scientific Computing, 98(1):13, 2024

  3. [3]

    G. Da Prato and J. Zabczyk. Stochastic Equations in Infinite Dimensions, volume 152. Cambridge University Press, 2014

  4. [4]

    S. Gong, P. Hu, Q. Meng, Y. Wang, R. Zhu, B. Chen, Z. Ma, H. Ni, and T.-Y. Liu. Deep latent regularity network for modeling stochastic partial differential equations. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 37, pages 7740–7747, 2023

  5. [5]

    M. Gubinelli and M. Hofmanová. Global solutions to elliptic and parabolic $\Phi^4$ models in Euclidean space. Communications in Mathematical Physics, 368(3):1201–1266, 2019

  6. [6]

    M. Gubinelli, P. Imkeller, and N. Perkowski. Paracontrolled distributions and singular PDEs. In Forum of Mathematics, Pi, volume 3, page e6. Cambridge University Press, 2015

  7. [7]

    M. Hairer. Solving the KPZ equation. Annals of Mathematics, pages 559–664, 2013

  8. [8]

    M. Hairer. A theory of regularity structures. Inventiones Mathematicae, 198(2):269–504, 2014

  9. [9]

    P. Kidger, J. Morrill, J. Foster, and T. Lyons. Neural controlled differential equations for irregular time series. Advances in Neural Information Processing Systems, 33:6696–6707, 2020

  10. [10]

    J. R. Klauder. Stochastic quantization. In Recent Developments in High-Energy Physics, pages 251–281. Springer, 1983

  11. [11]

    Z. Li, N. Kovachki, K. Azizzadenesheli, B. Liu, K. Bhattacharya, A. Stuart, and A. Anandkumar. Fourier neural operator for parametric partial differential equations. arXiv preprint arXiv:2010.08895, 2020

  12. [12]

    G. J. Lord, C. E. Powell, and T. Shardlow. An Introduction to Computational Stochastic PDEs, volume 50. Cambridge University Press, 2014

  13. [13]

    L. Lu, P. Jin, G. Pang, Z. Zhang, and G. E. Karniadakis. Learning nonlinear operators via DeepONet based on the universal approximation theorem of operators. Nature Machine Intelligence, 3(3):218–229, 2021

  14. [14]

    J. Morrill, C. Salvi, P. Kidger, and J. Foster. Neural rough differential equations for long time series. In International Conference on Machine Learning, pages 7829–7838. PMLR, 2021

  15. [15]

    A. Neufeld and P. Schmocker. Solving stochastic partial differential equations using neural networks in the Wiener chaos expansion. arXiv preprint arXiv:2411.03384, 2024

  16. [16]

    C. Salvi, M. Lemercier, and A. Gerasimovičs. Neural stochastic PDEs: Resolution-invariant learning of continuous spatiotemporal dynamics. Advances in Neural Information Processing Systems, 35:1333–1344, 2022

  17. [17]

    SPDEBench: An Extensive Benchmark for Learning Regular and Singular Stochastic PDEs

    H. Triebel. Theory of Function Spaces III, volume 100. Birkhäuser Basel, 2006. This is the Supplementary Materials for "SPDEBench: An Extensive Benchmark for Learning Regular and Singular Stochastic PDEs". A Basic Concepts in SPDE Theory. A.1 Space-time white noise. In most cases, the noise term $\xi = \xi(t, x)$ is introduced to represent a so-called space-time ...

  18. [18]

    $W(0) = 0$ almost surely

  19. [19]

    $W(t; \omega)$ is a continuous sample trajectory $\mathbb{R}_+ \to H$, for each $\omega \in \Omega$

  20. [20]

    $W(t)$ is $\mathcal{F}_t$-adapted and $W(t) - W(s)$ is independent of $\mathcal{F}_s$ for $s < t$

  21. [21]

    /Phi42+_expl_xi_eps_2_1200.parquet

    $W(t) - W(s) \sim \mathcal{N}(0, (t - s)Q)$ for all $0 \le s \le t$. In analogy to the Karhunen–Loève expansion, it can be shown that $W(t)$ is a $Q$-Wiener process if and only if for all $t \ge 0$, $W(t) = \sum_{j=1}^{\infty} \sqrt{\lambda_j}\, \phi_j\, \beta_j(t)$, where $\beta_j(t)$ are i.i.d. Brownian motions, and the series converges in $L^2(\Omega, H)$. Moreover, the series is $P$-a.s. uniformly convergent on $[0, T]$ for arbitrary $T$...