Stoch-IDENT: New Method and Mathematical Analysis for Identifying SPDEs from Data
Pith reviewed 2026-05-18 20:40 UTC · model grok-4.3
The pith
Stoch-IDENT recovers both drift and diffusion coefficients of high-order linear and nonlinear SPDEs from trajectory data by solving a quadratic sparse regression problem.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The paper makes three linked claims. First, for linear SPDEs with constant coefficients, identifiability from trajectory data follows from the spectral properties of the solution's mean and covariance, which generalizes deterministic PDE theory. Second, the diffusion term can be recovered by casting the problem as sparse regression with quadratic measurements induced from drift residuals and feature covariances. Third, the new Quadratic Subspace Pursuit algorithm solves this optimization and enjoys stable support recovery under certain conditions on the data.
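To see why the mean carries information about the drift, consider a generic linear SPDE in Itô form. The notation below (the operators $\mathcal{L}$ and $\mathcal{B}$, the mean $m$) is assumed here for illustration and is not taken from the paper. Under the usual integrability assumptions the Itô integral has zero expectation, so the mean solves a deterministic PDE to which deterministic identifiability arguments can be applied.

```latex
% Assumed notation: \mathcal{L} is a constant-coefficient linear differential
% operator, \mathcal{B} the diffusion operator, W a Wiener process, m = E[u].
\begin{align*}
  \mathrm{d}u &= \mathcal{L}u\,\mathrm{d}t + \mathcal{B}(u)\,\mathrm{d}W_t, \\
  \partial_t m &= \mathcal{L}m, \qquad m(t,x) = \mathbb{E}[u(t,x)].
\end{align*}
```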
What carries the argument
Quadratic Subspace Pursuit (QSP), a greedy algorithm that selects atoms to solve the non-convex, non-smooth sparse regression problem whose measurements are quadratic forms built from drift residuals and empirical covariances.
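One plausible way to write a sparse regression with quadratic measurements is sketched below. The symbols ($c$ for diffusion coefficients, $G$ for the feature vector, $A_i$ for measurement matrices, $s$ for the sparsity level) and the squared-loss objective are assumptions made for illustration; the paper's exact objective, which it describes as non-convex and non-smooth, may differ.

```latex
% Assumed model: diffusion amplitude g(u) = c^T G(u) with sparse c.
% r_i is the residual after removing the estimated drift over a step \Delta t.
\begin{align*}
  y_i &= \frac{\widehat{\mathbb{E}}[r_i^2]}{\Delta t}
       \approx c^\top A_i\, c,
       \qquad A_i = \widehat{\mathbb{E}}\!\left[G_i G_i^\top\right], \\
  \hat{c} &\in \arg\min_{\|c\|_0 \le s} \sum_i \left(y_i - c^\top A_i\, c\right)^2 .
\end{align*}
```

Each measurement is quadratic rather than linear in $c$, which is why standard greedy solvers for linear sparse regression do not apply directly.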
If this is right
- The method accommodates both additive and multiplicative noise structures in high-order SPDEs.
- QSP provides stable support recovery for the diffusion coefficients under conditions on the number and quality of trajectories.
- The sample-mean approach for the drift term extends deterministic PDE identification while accounting for stochastic effects.
- Numerical tests on various linear and nonlinear SPDEs confirm quantitative accuracy and qualitative fidelity of the recovered equations.
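The sample-mean idea for the drift can be illustrated on a toy problem. The sketch below is not the paper's algorithm: it simulates a stochastic heat equation with additive noise by an explicit Euler-Maruyama scheme, averages over trajectories, and fits the mean's time derivative with plain least squares on a three-term library. All function names, parameter values, and the choice of library are hypothetical.

```python
import numpy as np

def simulate_mean(a=0.5, sigma=0.1, n_traj=200, n_x=64, n_t=200, dt=2e-3, seed=0):
    """Simulate du = a*u_xx dt + sigma dW on a periodic grid; return sample means."""
    rng = np.random.default_rng(seed)
    x = np.linspace(0.0, 2 * np.pi, n_x, endpoint=False)
    dx = x[1] - x[0]
    # Every trajectory starts from the same deterministic initial condition.
    u = np.tile(np.sin(x) + 0.5 * np.cos(2 * x), (n_traj, 1))
    means = [u.mean(axis=0)]
    for _ in range(n_t):
        lap = (np.roll(u, -1, axis=1) - 2 * u + np.roll(u, 1, axis=1)) / dx**2
        noise = rng.standard_normal(u.shape)
        u = u + a * lap * dt + sigma * np.sqrt(dt) * noise
        means.append(u.mean(axis=0))
    return np.array(means), dx, dt

def fit_drift(means, dx, dt):
    """Least-squares fit of d(mean)/dt on the library [m, m_x, m_xx]."""
    m = means[:-1]
    dm_dt = (means[1:] - m) / dt  # forward difference in time
    m_x = (np.roll(m, -1, axis=1) - np.roll(m, 1, axis=1)) / (2 * dx)
    m_xx = (np.roll(m, -1, axis=1) - 2 * m + np.roll(m, 1, axis=1)) / dx**2
    features = np.stack([m.ravel(), m_x.ravel(), m_xx.ravel()], axis=1)
    coef, *_ = np.linalg.lstsq(features, dm_dt.ravel(), rcond=None)
    return coef  # ordered as coefficients of [m, m_x, m_xx]

means, dx, dt = simulate_mean()
coef = fit_drift(means, dx, dt)
print(dict(zip(["u", "u_x", "u_xx"], np.round(coef, 3))))
```

A sparsity-promoting selection step would replace the dense least-squares fit in a realistic setting; it is left out here to keep the sketch short.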
Where Pith is reading between the lines
- The same quadratic-measurement construction could be tested on inverse problems for stochastic ordinary differential equations or on data from real physical systems where only noisy trajectories are available.
- If the identifiability analysis for linear constant-coefficient cases extends to variable coefficients, the framework might apply to a broader class of time-inhomogeneous SPDEs.
- One could examine whether replacing the greedy QSP step with a convex relaxation changes the support-recovery guarantees or computational cost on the same quadratic measurements.
Load-bearing premise
The observed data must consist of sufficiently many independent trajectories so that sample means and empirical covariances converge to the true mean and covariance operators, allowing the quadratic measurements to isolate the diffusion coefficients without finite-sample bias.
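The dependence on the number of trajectories can be illustrated with a toy model, assuming a diffusion amplitude of the form c0 + c1*u and an exactly known drift. The sketch compares the empirical quadratic measurement (mean squared residual divided by the time step) with the quadratic form built from the empirical feature second-moment matrix. The function name, the coefficient values, and the Gaussian state samples are all hypothetical.

```python
import numpy as np

def quadratic_measurement_gap(n_traj, c=(0.3, 0.5), dt=1e-3, seed=0):
    """Gap between mean(r^2)/dt and c^T C c for a toy multiplicative-noise model."""
    rng = np.random.default_rng(seed)
    c = np.asarray(c, dtype=float)
    # Toy state values at one space-time point, one per trajectory (assumption).
    u = rng.standard_normal(n_traj)
    features = np.stack([np.ones(n_traj), u], axis=1)  # library [1, u]
    amplitude = features @ c                           # diffusion amplitude c0 + c1*u
    # Residual after removing the (here exactly known) drift over one step.
    residual = amplitude * np.sqrt(dt) * rng.standard_normal(n_traj)
    y = np.mean(residual**2) / dt                      # empirical quadratic measurement
    second_moment = features.T @ features / n_traj     # empirical feature second moments
    return abs(y - c @ second_moment @ c)

for n in (100, 10_000, 1_000_000):
    print(n, quadratic_measurement_gap(n))
```

In this toy model the gap is a Monte Carlo error that shrinks roughly like one over the square root of the number of trajectories, which is the regime the premise above asks for.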
What would settle it
Apply Stoch-IDENT to a large number of independent trajectories generated from a known linear SPDE whose drift and diffusion coefficients are given in advance; if the recovered diffusion support differs from the true support even though the number of trajectories is high and the theoretical conditions for QSP are met, the central recovery claim is false.
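The comparison at the heart of this test reduces to checking whether two coefficient vectors share the same support. A minimal helper for that check might look as follows; the function names and the threshold `tol` are hypothetical, and the estimated coefficients would come from whatever identification code is under test.

```python
import numpy as np

def support(coef, tol=1e-8):
    """Indices of coefficients whose magnitude exceeds tol."""
    return set(np.flatnonzero(np.abs(np.asarray(coef, dtype=float)) > tol).tolist())

def support_recovered(true_coef, est_coef, tol=1e-8):
    """True when the estimated support equals the true support exactly."""
    return support(true_coef, tol) == support(est_coef, tol)

# Same active terms, slightly different values: support is recovered.
print(support_recovered([0.0, 0.5, 0.0, 1.0], [0.0, 0.48, 0.0, 1.1]))
```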
Original abstract
In this paper, we propose Stoch-IDENT, a novel framework for identifying stochastic partial differential equations (SPDEs) from observational data. Our method can handle linear and nonlinear high-order SPDEs driven by time-dependent Wiener processes, accommodating both additive and multiplicative noise structures. To investigate the identifiability of SPDEs from trajectory data, we analyze the spectral properties of the solution's mean and covariance for linear SPDEs with constant coefficients, as well as the dimension of the solution space for parabolic and hyperbolic types, generalizing the identifiability theory for deterministic PDEs. Algorithmically, the drift term is identified via a sample-mean generalization of existing methods for PDE identification. For the diffusion term, we formulate a sparse regression problem with quadratic measurements induced from drift residuals and feature covariances. To address this challenging non-convex and non-smooth optimization, we develop a new greedy algorithm, Quadratic Subspace Pursuit (QSP), and prove that QSP enjoys stable support recovery under certain conditions. We validate Stoch-IDENT on various SPDEs, demonstrating its effectiveness through quantitative and qualitative evaluations.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes Stoch-IDENT, a framework for recovering both drift and diffusion terms of linear and nonlinear high-order SPDEs driven by time-dependent Wiener processes (additive or multiplicative noise) from trajectory data. Drift recovery generalizes deterministic sample-mean methods; diffusion is cast as a quadratic-measurement sparse regression solved by the new Quadratic Subspace Pursuit (QSP) algorithm, for which stable support recovery is proven under stated conditions. Identifiability is established via spectral properties of the mean and covariance operators together with solution-space dimension for linear constant-coefficient parabolic and hyperbolic SPDEs, generalizing deterministic PDE theory. Numerical experiments on several SPDEs are reported.
Significance. If the central claims hold, the work would advance data-driven discovery of SPDEs by supplying a unified algorithmic pipeline that handles both drift and diffusion, together with a new greedy solver (QSP) for quadratic measurements and a partial identifiability theory for the linear case. The explicit proof of stable support recovery for QSP and the generalization of deterministic identifiability arguments are concrete strengths that would be of interest to the numerical analysis and stochastic modeling communities.
major comments (3)
- [Identifiability analysis] Identifiability section: spectral and dimension arguments are supplied only for linear constant-coefficient SPDEs. No corresponding uniqueness, spectral, or solution-space analysis is given for the nonlinear SPDEs whose identification is claimed in the abstract and algorithmic sections; the central claim that the full framework identifies nonlinear SPDEs therefore rests on an unproven extension of the linear theory.
- [QSP algorithm and proof] QSP theorem: the conditions guaranteeing stable support recovery are referenced in the abstract but not restated or verified for the specific quadratic measurement operators constructed from drift residuals and empirical covariances in the SPDE examples; without this verification the recovery guarantee does not directly apply to the reported experiments.
- [Algorithmic framework] Drift-then-diffusion pipeline: the diffusion step uses residuals from the sample-mean drift estimate; no quantitative bound is provided on how finite-sample bias or variance in the drift step propagates into the quadratic measurements or the subsequent QSP recovery, which is load-bearing for the claimed robustness on finite trajectory data.
minor comments (2)
- [Notation and preliminaries] Notation for the quadratic measurement matrix and the precise definition of the feature library for nonlinear terms should be introduced earlier and used consistently across the algorithmic and theoretical sections.
- [Abstract] The abstract states that QSP enjoys stable support recovery 'under certain conditions'; these conditions should be summarized in one sentence in the abstract for immediate clarity.
Simulated Author's Rebuttal
We thank the referee for the careful reading of our manuscript and the constructive comments. We address each of the major comments below and indicate the revisions we plan to make.
Referee: Identifiability section: spectral and dimension arguments are supplied only for linear constant-coefficient SPDEs. No corresponding uniqueness, spectral, or solution-space analysis is given for the nonlinear SPDEs whose identification is claimed in the abstract and algorithmic sections; the central claim that the full framework identifies nonlinear SPDEs therefore rests on an unproven extension of the linear theory.
Authors: We appreciate this observation. The identifiability analysis provided in the manuscript focuses on linear constant-coefficient SPDEs, extending the deterministic theory through spectral properties of the mean and covariance operators and the dimension of the solution space. For nonlinear SPDEs, the identification relies on the data-driven sparse regression framework, which is validated through numerical experiments on several nonlinear examples. To better align the claims with the provided analysis, we will revise the abstract and relevant sections to specify that the rigorous identifiability results apply to linear SPDEs, while the method is designed and tested for both linear and nonlinear cases. This revision will clarify the theoretical scope. revision: yes
Referee: QSP theorem: the conditions guaranteeing stable support recovery are referenced in the abstract but not restated or verified for the specific quadratic measurement operators constructed from drift residuals and empirical covariances in the SPDE examples; without this verification the recovery guarantee does not directly apply to the reported experiments.
Authors: The stable support recovery theorem for QSP is stated in general terms in the manuscript, applicable to quadratic measurement models satisfying the given conditions. In the context of SPDE identification, the quadratic measurements are derived from the residuals after drift estimation and the covariance structure of the features. We will include an additional discussion or appendix that verifies or discusses how the general conditions of the theorem are met for the operators used in the numerical experiments, thereby strengthening the connection between the theory and the reported results. revision: yes
Referee: Drift-then-diffusion pipeline: the diffusion step uses residuals from the sample-mean drift estimate; no quantitative bound is provided on how finite-sample bias or variance in the drift step propagates into the quadratic measurements or the subsequent QSP recovery, which is load-bearing for the claimed robustness on finite trajectory data.
Authors: We agree that a quantitative analysis of error propagation from the drift estimation to the diffusion recovery would provide stronger theoretical support for the robustness claims. Currently, the manuscript demonstrates robustness through numerical experiments with varying numbers of trajectories and noise intensities. Providing a full perturbation bound is a non-trivial extension that would involve analyzing the sensitivity of the quadratic measurements and the QSP algorithm to perturbations in the residuals. We will add a paragraph in the discussion section to acknowledge this limitation and to outline the conditions under which the method is expected to perform reliably based on the empirical evidence. revision: partial
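The error-propagation concern in the last exchange can be made concrete with a one-step heuristic. This is not a bound from the paper: it assumes that the drift error $\delta$ and the diffusion amplitude $g$ are fixed over one time step and independent of the Wiener increment, and all symbols are introduced here for illustration.

```latex
% Assumed notation: \delta is the drift estimation error, g the true diffusion
% amplitude, \Delta W \sim N(0, \Delta t) the Wiener increment over one step.
\begin{align*}
  r &= \delta\,\Delta t + g\,\Delta W, \\
  \mathbb{E}[r^2] &= \delta^2\,\Delta t^2 + g^2\,\Delta t, \\
  \frac{\mathbb{E}[r^2]}{\Delta t} &= g^2 + \delta^2\,\Delta t .
\end{align*}
```

Under these assumptions a drift error shifts the quadratic measurement by $\delta^2\,\Delta t$, so its effect is small when the time step is small relative to $g^2/\delta^2$. Turning this into a recovery guarantee would still require the perturbation analysis the authors describe as non-trivial.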
Circularity Check
No circularity detected; derivation chain remains self-contained
The paper grounds its identifiability claims for linear constant-coefficient SPDEs in an analysis of spectral properties of the mean and covariance operators together with solution-space dimension, explicitly generalizing prior deterministic PDE results rather than deriving them from the present method's outputs. The algorithmic pipeline separates drift recovery (via sample-mean extension of existing deterministic techniques) from diffusion recovery (via quadratic measurements on residuals and covariances, solved by the independently analyzed QSP algorithm whose support-recovery guarantee is stated under explicit conditions). No step reduces a claimed prediction or uniqueness result to a fitted parameter, self-citation chain, or definitional tautology; the nonlinear extension is presented as an algorithmic application without asserting that the linear spectral theory forces it by construction. The overall framework therefore contains independent mathematical content and does not collapse to its inputs.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: The solution process admits well-defined mean and covariance operators whose spectral properties determine identifiability for linear constant-coefficient SPDEs.
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean, reality_from_one_distinction (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: "we analyze the spectral properties of the solution's mean and covariance for linear SPDEs with constant coefficients... generalizing the identifiability theory for deterministic PDEs"
- IndisputableMonolith/Cost/FunctionalEquation.lean, washburn_uniqueness_aczel (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: "For the diffusion term, we formulate a sparse regression problem with quadratic measurements... develop a new greedy algorithm, Quadratic Subspace Pursuit (QSP)"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.