From Spectral Methods to Sample Complexity Bounds for Fourier Neural Operators

Daniel Sanz-Alonso; Nathan Waniorek; Nisha Chandramoorthy

arXiv: 2607.00320 · v1 · pith:OND2NGONnew · submitted 2026-07-01 · 📊 stat.ML · cs.LG · cs.NA · math.NA

Pith reviewed 2026-07-02 00:51 UTC · model grok-4.3

classification 📊 stat.ML · cs.LG · cs.NA · math.NA

keywords Fourier neural operators · sample complexity · spectral methods · dissipative evolution equations · operator learning · approximation bounds · Navier-Stokes equation · Allen-Cahn equation

The pith

Fourier neural operators achieve polynomial sample complexity for learning time-T solution operators of dissipative evolution equations defined via spectral methods.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes approximation and learning guarantees for Fourier neural operators applied to time-T solution operators of dissipative evolution equations. It introduces classes of such operators defined through spectral methods to derive FNO approximation bounds and polynomial sample complexity results that hold uniformly over broad families of equations. For polynomial nonlinearities the rates depend primarily on input smoothness and physical domain dimension; for non-polynomial smooth nonlinearities they also incorporate nonlinearity smoothness and dissipation strength. The results cover equations including Navier-Stokes, Allen-Cahn, and Cahn-Hilliard, linking classical spectral approximation theory to operator learning.

Core claim

By defining classes of evolution operators through spectral methods, FNO approximation bounds and polynomial sample complexity guarantees follow for time-T solution operators of dissipative evolution equations. These hold uniformly over broad families rather than single fixed PDEs, with learning rates for polynomial nonlinearities depending primarily on the smoothness of the input space and the dimension of the physical domain, and for non-polynomial cases additionally on the smoothness of the nonlinear terms and the dissipation strength. The results apply in particular to the Navier-Stokes, Allen-Cahn, and Cahn-Hilliard equations.

What carries the argument

Classes of evolution operators defined through spectral methods, which link stable and accurate spectral discretizations to efficient FNO approximation and learning of the corresponding solution operators.
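
As a rough illustration of what a stable spectral discretization can look like in practice, here is a minimal sketch of a semi-implicit Fourier spectral time step for a 1D Allen-Cahn equation $u_t = \nu u_{xx} + u - u^3$ on a periodic domain. It is not the scheme analyzed in the paper: the 1D setting, the function names `allen_cahn_step` and `solve`, and all parameter values are assumptions made for illustration only.

```python
import numpy as np


def allen_cahn_step(u, nu, tau):
    """One semi-implicit Fourier spectral step on the periodic domain [0, 2*pi).

    Diffusion (nu * u_xx) is treated implicitly, the reaction term
    (u - u**3) explicitly. Illustrative sketch, not the paper's scheme.
    """
    n = u.size
    # Integer wavenumbers 0, 1, ..., n // 2 for a grid of n points on [0, 2*pi).
    k = np.fft.rfftfreq(n, d=1.0 / n)
    u_hat = np.fft.rfft(u)
    f_hat = np.fft.rfft(u - u**3)
    # Implicit diffusion damps mode k by the factor 1 / (1 + tau * nu * k**2).
    u_hat_new = (u_hat + tau * f_hat) / (1.0 + tau * nu * k**2)
    return np.fft.irfft(u_hat_new, n=n)


def solve(u0, nu, tau, steps):
    """Apply `steps` time steps, approximating the time-T map with T = steps * tau."""
    u = u0.copy()
    for _ in range(steps):
        u = allen_cahn_step(u, nu, tau)
    return u


# Hypothetical example: small sinusoidal initial condition on 64 grid points.
x = np.linspace(0.0, 2.0 * np.pi, 64, endpoint=False)
u0 = 0.1 * np.sin(x)
uT = solve(u0, nu=0.1, tau=0.01, steps=100)
```

The map `u0 -> uT` is a discrete stand-in for a time-T solution operator. The dissipation parameter `nu` enters only through the damping factor in the denominator, which is where the strength of dissipation shows up in this kind of scheme.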

If this is right

Polynomial sample complexity holds uniformly across broad families of dissipative equations for polynomial nonlinearities, with rates set by input smoothness and domain dimension.
For equations with non-polynomial smooth nonlinearities the sample complexity remains polynomial but now also scales with nonlinearity smoothness and dissipation strength.
The same guarantees apply to the Navier-Stokes, Allen-Cahn, and Cahn-Hilliard equations as instances of the defined classes.
Classical spectral approximation theory directly yields the FNO bounds once the operator is placed in the spectral-method class.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The spectral-method classes could be used to certify FNO performance on other dissipative equations before training, by checking only their spectral discretization properties.
If the premise holds, replacing spectral discretizations with other stable schemes might produce analogous polynomial bounds for alternative neural operators.
The uniform-over-families result suggests that a single set of FNO hyperparameters could work across an entire family of equations without retuning per PDE.

Load-bearing premise

FNOs can efficiently approximate and learn solution operators whenever these operators admit stable and accurate spectral discretizations.
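
To make the premise concrete, the following is a minimal NumPy sketch of a single Fourier layer of the kind FNOs stack: transform to Fourier space, keep the lowest `k_max` modes, mix channels mode by mode, transform back, and add a pointwise linear term before a nonlinearity. The function name `fourier_layer`, the array shapes, and the choice of ReLU are assumptions for illustration; the paper's precise FNO definition may differ.

```python
import numpy as np


def fourier_layer(v, R, W, b, k_max):
    """Single 1D Fourier layer (illustrative sketch).

    v: (n, c) real samples of a c-channel function on a uniform periodic grid.
    R: (k_max, c, c) complex spectral weights, one channel-mixing matrix per mode.
    W: (c, c) real pointwise weights; b: (c,) real bias.
    Requires k_max <= n // 2 + 1.
    """
    n = v.shape[0]
    v_hat = np.fft.rfft(v, axis=0)  # shape (n // 2 + 1, c)
    out_hat = np.zeros_like(v_hat)
    # Keep only the lowest k_max modes and mix channels within each mode.
    out_hat[:k_max] = np.einsum("kio,ki->ko", R, v_hat[:k_max])
    spectral = np.fft.irfft(out_hat, n=n, axis=0)
    # Pointwise linear term plus bias, followed by ReLU (an assumed activation).
    return np.maximum(spectral + v @ W + b, 0.0)


# Hypothetical example with random inputs and weights.
rng = np.random.default_rng(0)
n, c, k_max = 32, 3, 5
v = rng.standard_normal((n, c))
R = rng.standard_normal((k_max, c, c)) + 1j * rng.standard_normal((k_max, c, c))
W = rng.standard_normal((c, c))
b = rng.standard_normal(c)
out = fourier_layer(v, R, W, b, k_max)
```

The mode truncation mirrors the projection onto finitely many Fourier modes used in spectral methods, which is the structural link the premise relies on.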

What would settle it

A direct computation showing that the number of samples needed to learn the Navier-Stokes solution operator to fixed accuracy grows super-polynomially in the input smoothness or domain dimension parameters would falsify the polynomial sample complexity claim.

Original abstract

We establish approximation and learning guarantees for Fourier neural operators (FNOs) applied to time-$T$ solution operators of dissipative evolution equations. The analysis builds on the premise that FNOs can efficiently approximate and learn solution operators whenever these operators admit stable and accurate spectral discretizations. To formalize this idea, we introduce classes of evolution operators defined through spectral methods and derive FNO approximation bounds and polynomial sample complexity guarantees for these classes. For equations with polynomial nonlinearities, the learning rates depend primarily on the smoothness of the input space and the dimension of the physical domain. Our results hold uniformly over broad families of dissipative equations, rather than for a single fixed PDE, and apply in particular to the Navier--Stokes, Allen--Cahn, and Cahn--Hilliard equations. For equations with non-polynomial smooth nonlinearities, we prove that polynomial sample complexity still holds with rates that now additionally depend on the smoothness of the nonlinear terms and the dissipation strength. Overall, we connect classical spectral approximation theory with modern operator learning and explain when FNOs can learn nonlinear evolution operators efficiently.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

The paper derives FNO sample complexity bounds uniform over dissipative PDE families via spectral methods, but the uniformity for non-polynomial nonlinearities looks vulnerable to varying dissipation strength.

The main takeaway is that they formalize classes of evolution operators through spectral discretizations and then give FNO approximation and polynomial sample complexity results that are meant to apply uniformly across families of dissipative equations rather than one equation at a time. For polynomial nonlinearities the rates depend mainly on input smoothness and physical dimension, and they explicitly cover Navier-Stokes, Allen-Cahn, and Cahn-Hilliard. That connection between classical spectral theory and operator learning is the concrete new piece.

They do a reasonable job laying out the premise that stable spectral discretizations imply good FNO behavior, and the uniform-over-families framing is a useful organizational step if it holds. The abstract is clear that the polynomial case avoids extra dependence on dissipation.

The soft spot is the non-polynomial case. The abstract states that rates then also depend on the smoothness of the nonlinear terms and the dissipation strength. If the broad family is defined so dissipation can approach zero without a uniform lower bound, the hidden 1/dissipation factor can make the sample complexity non-uniform and non-polynomial for some members. That directly undercuts the uniformity claim while leaving the spectral premise untouched. The stress-test note captures this exactly, and nothing in the abstract contradicts it.

The work is aimed at researchers who want rigorous learning guarantees for scientific ML on fluids and materials. It shows clear engagement with the literature on spectral methods and operator learning, so it is worth a serious referee even if the non-polynomial uniformity needs tightening or clarification on how the families are bounded.

Referee Report

1 major / 0 minor

Summary. The paper claims to establish approximation and learning guarantees for Fourier neural operators (FNOs) applied to time-T solution operators of dissipative evolution equations. It introduces classes of evolution operators defined through spectral methods, derives FNO approximation bounds, and provides polynomial sample complexity guarantees. These hold uniformly over broad families of dissipative equations (rather than single fixed PDEs), with rates depending primarily on input smoothness and domain dimension for polynomial nonlinearities, and additionally on nonlinearity smoothness and dissipation strength for non-polynomial cases. The results apply in particular to the Navier-Stokes, Allen-Cahn, and Cahn-Hilliard equations, connecting spectral approximation theory to operator learning.

Significance. If the central derivations and uniformity claims hold, the work would be significant for providing explicit, polynomial sample complexity bounds for FNOs on broad classes of dissipative PDEs, rather than isolated examples. This strengthens the theoretical basis for operator learning in scientific ML by linking it directly to classical spectral methods, with potential implications for justifying FNO use on families of evolution equations.

major comments (1)

[Abstract] The central claim that polynomial sample complexity 'hold[s] uniformly over broad families of dissipative equations' for non-polynomial nonlinearities is load-bearing for the paper's contribution, yet the rates 'additionally depend on ... the dissipation strength.' No uniform lower bound on dissipation is stated for the family, so the hidden 1/dissipation^α factor can make the complexity non-polynomial (and non-uniform) for members of the family with weak dissipation. This directly risks the uniformity guarantee while the spectral-discretization premise remains intact.
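
The concern can be made concrete with a short worked bound. The display below is a sketch under stated assumptions: the product form $C\,\nu^{-\alpha}\varepsilon^{-\beta}$, the symbol $\nu$ for dissipation strength, $\varepsilon$ for target accuracy, and the constants $C$, $\alpha$, $\beta$, $\nu_0$ are hypothetical placeholders that follow the '1/dissipation^α' wording above. None of them is taken from the paper.

```latex
% Hypothetical sample-complexity bound for accuracy \varepsilon and dissipation \nu.
n(\varepsilon,\nu) \;\le\; C\,\nu^{-\alpha}\,\varepsilon^{-\beta},
\qquad C,\alpha,\beta > 0 .

% Family with dissipation bounded below by \nu_0 > 0:
% the bound is uniform over the family and polynomial in 1/\varepsilon.
\sup_{\nu \ge \nu_0} C\,\nu^{-\alpha}\,\varepsilon^{-\beta}
  \;=\; C\,\nu_0^{-\alpha}\,\varepsilon^{-\beta} \;<\; \infty .

% Family in which dissipation may tend to zero:
% the bound degenerates and gives no uniform guarantee.
\sup_{0 < \nu \le \nu_0} C\,\nu^{-\alpha}\,\varepsilon^{-\beta} \;=\; \infty .
```

In the second case a bound of this form says nothing uniform about the family. Whether the true sample complexity also degenerates is a separate question that such a bound cannot settle.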

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the careful reading and for identifying a potential source of ambiguity in the uniformity statement for non-polynomial nonlinearities. We address the comment below and will revise the manuscript accordingly.

Referee: [Abstract] The central claim that polynomial sample complexity 'hold[s] uniformly over broad families of dissipative equations' for non-polynomial nonlinearities is load-bearing for the paper's contribution, yet the rates 'additionally depend on ... the dissipation strength.' No uniform lower bound on dissipation is stated for the family, so the hidden 1/dissipation^α factor can make the complexity non-polynomial (and non-uniform) for members of the family with weak dissipation. This directly risks the uniformity guarantee while the spectral-discretization premise remains intact.

Authors: We agree that the abstract does not explicitly state a uniform lower bound on dissipation strength for the families considered in the non-polynomial case. The analysis in the body of the paper treats dissipation strength as a fixed parameter of each family (with the sample-complexity bound allowed to depend on it), and the uniformity is with respect to other parameters (smoothness, dimension, nonlinearity smoothness) within any family whose dissipation is bounded away from zero by a positive constant. This is consistent with the spectral-discretization premise, which naturally parameterizes families by such constants. However, the abstract's phrasing could be misread as claiming uniformity even when dissipation approaches zero. We will revise the abstract (and the corresponding statement in the introduction) to make the dependence on a uniform positive lower bound on dissipation explicit, thereby clarifying that the polynomial rates hold uniformly over each such family.

revision: yes

Circularity Check

0 steps flagged

Derivation from spectral discretization classes to FNO bounds is self-contained with no reduction to inputs

The paper introduces classes of evolution operators explicitly defined through spectral methods, then derives approximation bounds and sample complexity results for FNOs on those classes. This constitutes a direct theoretical mapping rather than a self-definitional loop, fitted prediction, or load-bearing self-citation chain. No equations or claims in the provided text reduce the central guarantees to tautologies or renamings of the inputs; the polynomial rates for polynomial nonlinearities and the additional dependence on dissipation for non-polynomial cases are presented as derived consequences of the spectral premise. The uniformity statement is a stated result, not a definitional artifact.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

Abstract-only review; full technical assumptions unavailable. The central premise is treated as a domain assumption.

axioms (1)

domain assumption: FNOs can efficiently approximate and learn solution operators whenever these operators admit stable and accurate spectral discretizations.
Stated explicitly as the premise on which the analysis builds.

pith-pipeline@v0.9.1-grok · 5735 in / 1134 out tokens · 32085 ms · 2026-07-02T00:51:23.721204+00:00 · methodology

Reference graph

Works this paper leans on

63 extracted references · 13 canonical work pages · 2 internal anchors

[1] B. Adcock, N. Dexter, and S. Moraga, Optimal approximation of infinite-dimensional holomorphic functions, Calcolo, 61 (2024), p. 12.
[2] B. Adcock, N. Dexter, and S. Moraga, Optimal approximation of infinite-dimensional holomorphic functions II: recovery from iid pointwise samples, Journal of Complexity, 89 (2025), p. 101933.
[3] B. Adcock, M. Griebel, and G. Maier, Learning Lipschitz operators with respect to Gaussian measures with near-optimal sample complexity, arXiv preprint arXiv:2410.23440, (2024).
[4] M. Adrian, D. Sanz-Alonso, and R. Willett, Data assimilation with machine learning surrogate models: A case study with FourCastNet, Artificial Intelligence for the Earth Systems, 4 (2025), p. e240050.
[5] S. M. Allen and J. W. Cahn, A microscopic theory for antiphase boundary motion and its application to antiphase domain coarsening, Acta Metallurgica, 27 (1979), pp. 1085–1095.
[6] B. Bonev, T. Kurth, C. Hundt, J. Pathak, M. Baust, K. Kashinath, and A. Anandkumar, Spherical Fourier neural operators: Learning stable dynamics on the sphere, in International Conference on Machine Learning, PMLR, 2023, pp. 2806–2823.
[7] N. Boullé, D. Halikias, and A. Townsend, Elliptic PDE learning is provably data-efficient, Proceedings of the National Academy of Sciences, 120 (2023), p. e2303904120.
[8] N. Boullé, S. Kim, T. Shi, and A. Townsend, Learning Green’s functions associated with time-dependent partial differential equations, Journal of Machine Learning Research, 23 (2022), pp. 1–34.
[9] N. Boullé and A. Townsend, A mathematical guide to operator learning, in Handbook of Numerical Analysis, vol. 25, Elsevier, 2024, pp. 83–125.
[10] J. Bourgain and D. Li, Strong ill-posedness of the incompressible Euler equation in borderline Sobolev spaces, Inventiones Mathematicae, 201 (2015), pp. 97–157.
[11] J. Chen and D. Sanz-Alonso, Convergence rates for learning pseudo-differential operators, arXiv preprint arXiv:2601.04473, (2026).
[12] J. Chen and D. Sanz-Alonso, Optimal multiscale learning of linear operators, arXiv preprint arXiv:2606.16913, (2026).
[13] K. Chen, M. Krishnan, and H. Yang, Error analysis for learning the time-stepping operator of evolutionary PDEs, arXiv preprint arXiv:2509.04256, (2025).
[14] K. Chen, C. Wang, and H. Yang, Deep operator learning lessens the curse of dimensionality for PDEs, Transactions on Machine Learning Research, (2023).
[15] L. Q. Chen and J. Shen, Applications of semi-implicit Fourier-spectral method to phase field equations, Computer Physics Communications, 108 (1998), pp. 147–158.
[16] X. Cheng, Energy stable semi-implicit schemes for the 2D Allen–Cahn and fractional Cahn–Hilliard equations, IMA Journal of Numerical Analysis, (2025), p. draf010.
[17] D. S. Clark, Short proof of a discrete Gronwall inequality, Discrete Applied Mathematics, 16 (1987), pp. 279–281.
[18] M. V. de Hoop, N. B. Kovachki, N. H. Nelsen, and A. M. Stuart, Convergence rates for learning linear operators from noisy data, SIAM/ASA Journal on Uncertainty Quantification, 11 (2023), pp. 480–513.
[19] T. De Ryck and S. Mishra, Generic bounds on the approximation error for physics-informed (and) operator learning, Advances in Neural Information Processing Systems, 35 (2022), pp. 10945–10958.
[20] D. Elbrächter, D. Perekrestenko, P. Grohs, and H. Bölcskei, Deep neural network approximation theory, IEEE Transactions on Information Theory, 67 (2021), pp. 2581–2623.
[21] C. M. Elliott and S. Luckhaus, A generalised diffusion equation for phase separation of a multi-component mixture with interfacial free energy, (1991).
[22] P. J. Flory, Thermodynamics of high polymer solutions, The Journal of Chemical Physics, 10 (1942), pp. 51–61.
[23] T. Furuya, K. Taniguchi, and S. Okuda, Quantitative approximation for neural operators in nonlinear parabolic equations, in The Thirteenth International Conference on Learning Representations, 2025.
[24] V. Gopakumar, S. Pamela, L. Zanisi, Z. Li, A. Anandkumar, and M. Team, Fourier neural operator for plasma modelling, arXiv preprint arXiv:2302.06542, (2023).
[25] P. Grohs and F. Voigtlaender, Proof of the theory-to-practice gap in deep learning via sampling complexity bounds for neural network approximation spaces, Foundations of Computational Mathematics, 24 (2024), pp. 1085–1143. [26] B. Guo, Spectral Methods and Their Applications, World Scientific, 1998.
[26] Y. He, Stability and error analysis for a spectral Galerkin method for the Navier-Stokes equations with H2 or H1 initial data, Numerical Methods for Partial Differential Equations: An International Journal, 21 (2005), pp. 875–904.
[27] Y. He, The Euler implicit/explicit scheme for the 2D time-dependent Navier-Stokes equations with smooth or non-smooth initial data, Mathematics of Computation, 77 (2008), pp. 2097–2124.
[28] Y. He and Y. Liu, Stability and convergence of the spectral Galerkin method for the Cahn–Hilliard equation, Numerical Methods for Partial Differential Equations: An International Journal, 24 (2008), pp. 1485–1500.
[29] M. L. Huggins, Theory of solutions of high polymers, Journal of the American Chemical Society, 64 (1942), pp. 1712–1719.
[30] T. Kim and M. Kang, Bounding the Rademacher complexity of Fourier neural operators, Machine Learning, 113 (2024), pp. 2467–2498.
[31] Y. Korolev, Two-layer neural networks with values in a Banach space, SIAM Journal on Mathematical Analysis, 54 (2022), pp. 6358–6389.
[32] S. Kotsuki, K. Shiraishi, and A. Okazaki, Ensemble data assimilation to diagnose AI-based weather prediction models: a case with ClimaX version 0.3.1, Geoscientific Model Development, 18 (2025), pp. 7215–7225.
[33] N. Kovachki, S. Lanthaler, and S. Mishra, On universal approximation and error bounds for Fourier neural operators, Journal of Machine Learning Research, 22 (2021), pp. 1–76.
[34] N. B. Kovachki, S. Lanthaler, and H. Mhaskar, Data complexity estimates for operator learning, arXiv preprint arXiv:2405.15992, (2024).
[35] N. B. Kovachki, S. Lanthaler, and A. M. Stuart, Operator learning: Algorithms and analysis, Handbook of Numerical Analysis, 25 (2024), pp. 419–467.
[36] S. Lanthaler and N. H. Nelsen, Error bounds for learning with vector-valued random features, Advances in Neural Information Processing Systems, 36 (2023), pp. 71834–71861.
[37] S. Lanthaler, A. M. Stuart, and M. Trautner, Discretization error of Fourier neural operators, arXiv preprint arXiv:2405.02221, (2024).
[38] D. Li, Z. Qiao, and T. Tang, Characterizing the stabilization size for semi-implicit Fourier-spectral method to phase field equations, SIAM Journal on Numerical Analysis, 54 (2016), pp. 1653–1681.
[39] D. Li, C. Quan, and T. Tang, Stability and convergence analysis for the implicit-explicit method to the Cahn–Hilliard equation, Mathematics of Computation, 91 (2022), pp. 785–809.
[40] D. Li and T. Tang, Stability of the semi-implicit method for the Cahn-Hilliard equation with logarithmic potentials, Annals of Applied Mathematics, 37 (2021), pp. 31–60.
[41] W. Li and D. Sanz-Alonso, Continuous data assimilation with learned surrogate dynamics, arXiv preprint arXiv:2606.00480, (2026).
[42] Z. Li, D. Z. Huang, B. Liu, and A. Anandkumar, Fourier neural operator with learned deformations for PDEs on general geometries, Journal of Machine Learning Research, 24 (2023), pp. 1–26.
[43] Z. Li, N. B. Kovachki, K. Azizzadenesheli, B. Liu, K. Bhattacharya, A. M. Stuart, and A. Anandkumar, Fourier neural operator for parametric partial differential equations, in 9th International Conference on Learning Representations, ICLR 2021, Virtual Event, Austria, May 3-7, 2021, OpenReview.net, 2021.
[44] H. Liu, H. Yang, M. Chen, T. Zhao, and W. Liao, Deep nonparametric estimation of operators between infinite dimensional spaces, Journal of Machine Learning Research, 25 (2024), pp. 1–67.
[45] N. Liu, S. Jafarzadeh, and Y. Yu, Domain agnostic Fourier neural operators, Advances in Neural Information Processing Systems, 36 (2023), pp. 47438–47450.
[46] N. H. Nelsen and Y. Yang, Operator learning meets inverse problems: A probabilistic perspective, arXiv preprint arXiv:2508.20207, (2025).
[47] J. Pathak, S. Subramanian, P. Harrington, S. Raja, A. Chattopadhyay, M. Mardani, T. Kurth, D. Hall, Z. Li, K. Azizzadenesheli, et al., FourCastNet: A global data-driven high-resolution weather model using adaptive Fourier neural operators, arXiv preprint arXiv:2202.11214, (2022).
[48] J. C. Robinson, Infinite-Dimensional Dynamical Systems: An Introduction to Dissipative Parabolic PDEs and the Theory of Global Attractors, vol. 28, Cambridge University Press, 2001.
[49] D. Sanz-Alonso and N. Waniorek, Long-time accuracy of ensemble Kalman filters for chaotic dynamical systems and machine-learned dynamical systems, SIAM Journal on Applied Dynamical Systems, 24 (2025), pp. 2246–2286.
[50] E. M. Stein, Singular Integrals and Differentiability Properties of Functions, no. 30, Princeton University Press, 1970.
[51] U. Subedi and A. Tewari, Controlling statistical, discretization, and truncation errors in learning Fourier linear operators, arXiv preprint arXiv:2408.09004, (2024).
[52] U. Subedi and A. Tewari, Operator learning: A statistical perspective, arXiv preprint arXiv:2504.03503, (2025).
[53] R. Temam, Infinite-Dimensional Dynamical Systems in Mechanics and Physics, vol. 68, Springer Science & Business Media, 2012.
[54] M. J. Wainwright, High-Dimensional Statistics: A Non-Asymptotic Viewpoint, vol. 48, Cambridge University Press, 2019.
[55] C. Wang and A. Townsend, Operator learning for hyperbolic partial differential equations, arXiv preprint arXiv:2312.17489, (2023).
[56] G. Wen, Z. Li, K. Azizzadenesheli, A. Anandkumar, and S. M. Benson, U-FNO—An enhanced Fourier neural operator-based deep-learning model for multiphase flow, Advances in Water Resources, 163 (2022), p. 104180.
[57] D. Yarotsky, Error bounds for approximations with deep ReLU networks, Neural Networks, 94 (2017), pp. 103–114.
[58] T. Zhou, X. Wan, D. Z. Huang, Z. Li, Z. Peng, A. Anandkumar, J. F. Brady, P. W. Sternberg, and C. Daraio, AI-aided geometric design of anti-infection catheters, Science Advances, 10 (2024), p. eadj1741.
A Background on Multilayer Perceptrons and FNOs. This appendix overviews existing approximation results for multilayer perceptrons and FNOs. A.1 Multilayer...
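[59] This, combined with Lemma B.5, yields
$$|\mathcal{E}| \le \tfrac{1}{2} r_N \|e^{j-1}\|_{L^2}^2 + \tfrac{1}{2}\|\mathcal{A}^{1/2} e^{j}\|_{L^2}^2 + \tfrac{1}{2}\Big\|\mathcal{A}^{-1/2}\Big(\Psi_{\mathcal{G}}(u^{j-1}) - \mathcal{P}_N \mathcal{D}_s\big(\mathcal{G}(u^{j-1})\big) - \bar{E}(u^{j-1})\Big)\Big\|_{L^2}^2 + \tfrac{1}{2}\|\mathcal{A}^{1/2} e^{j}\|_{L^2}^2 + \tfrac{1}{2} r_N \|e^{j-1}\|_{L^2}^2 + \tfrac{1}{2}\|e^{j}\|_{L^2}^2 + \tfrac{1}{2}\|\bar{E}(u^{j-1})\|_{L^2}^2 + \tfrac{1}{2}\|e^{j}\|_{L^2}^2,$$
where we have defined $r_N = r\, C_{\mathcal{D}}^2\, c^{-1} N^{d(p-1)} (U')^{2p-2}$. Assuming that $U'$ is taken to be large enough such that $\|u^{j}\|_{L^2} \le U'$ (we will specify...
[60] This, combined with the Lipschitz bound encoded in the definition of the nonlinearity class $G(\alpha, C_{\mathcal{G}})$, yields that
$$|\mathcal{E}| \le \frac{(d \tilde{d}\, C_{\mathcal{D}} C_{\mathcal{G}})^2}{2c} \|e^{j-1}\|_{L^2}^2 + \|\mathcal{A}^{1/2} e^{j}\|_{L^2}^2 + \tfrac{1}{2}\Big\|\mathcal{A}^{-1/2}\Big(\mathcal{P}_N \mathcal{D}_s\big(\mathcal{G}(u^{j-1})\big) - \Psi_{\mathcal{G}}(u^{j-1})\Big)\Big\|_{L^2}^2 .$$
Assuming that $U'$ is chosen to be large enough such that $\|u^{j}\|_{L^2} \le U'$ (we will specify a specific choice of $U'$ shortly below), Lemma B.6 yields $|\mathcal{E}| \le (d \tilde{d}$...
[61] For the free energy functional
$$\mathcal{E}(u) = \int_{\mathbb{T}^2} \Big( \frac{\nu}{2} |\nabla u|^2 - \frac{\theta_c}{2} u^2 + \mathcal{W}(u) \Big)\, dx, \qquad \text{(D.5)}$$
the iterates are energy stable: for all $j \ge 0$,
$$\mathcal{E}(u_N^{j+1}) \le \mathcal{E}(u_N^{j}). \qquad \text{(D.6)}$$
[62] The iterates $u_N^{j}$ are well defined for all $j \ge 1$ and there exists $V(U, \delta_0, \nu, \theta, \theta_c)$ such that
$$\sup_{j \ge 1} \|u_N^{j}\|_{H^5} \le V. \qquad \text{(D.7)}$$
[63] There exists $\delta_1(U, \delta_0, \nu, \theta, \theta_c) \in (0,1)$ such that $\sup_{j \ge 1} \|u_N^{j}\|_{L^\infty} \le 1 - \delta_1$. Proof. This theorem is an analogue of the stability result [41, Theorem 1.1] for the scheme (4.8). Note that [41, Theorem 1.1] holds for the scheme
$$\frac{u^{j+1} - u^{j}}{\tau} = -\nu \Delta^2 u^{j+1} - \theta_c \Delta u^{j+1} + \Delta\big(\tilde{w}(u_N^{j})\big),$$
which, in contrast to (4.8), is spatially continuous and treats the $\theta_c \Delta u$ term implicitly. As explai...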

[1] [1]

Adcock, N

B. Adcock, N. Dexter, and S. Moraga,Optimal approximation of infinite-dimensional holomorphic functions, Calcolo, 61 (2024), p. 12

2024

[2] [2]

,Optimal approximation of infinite-dimensional holomorphic functions II: recovery from iid pointwise samples, Journal of Complexity, 89 (2025), p. 101933

2025

[3] [3]

Adcock, M

B. Adcock, M. Griebel, and G. Maier,Learning Lipschitz operators with respect to Gaussian measures with near-optimal sample complexity, arXiv preprint arXiv:2410.23440, (2024)

work page arXiv 2024

[4] [4]

Adrian, D

M. Adrian, D. Sanz-Alonso, and R. Willett,Data assimilation with machine learning surrogate models: A case study with FourCastNet, Artificial Intelligence for the Earth Systems, 4 (2025), p. e240050

2025

[5] [5]

S. M. Allen and J. W. Cahn,A microscopic theory for antiphase boundary motion and its application to antiphase domain coarsening, Acta Metallurgica, 27 (1979), pp. 1085–1095

1979

[6] [6]

Bonev, T

B. Bonev, T. Kurth, C. Hundt, J. Pathak, M. Baust, K. Kashinath, and A. Anandkumar, Spherical Fourier neural operators: Learning stable dynamics on the sphere, in International Conference on Machine Learning, PMLR, 2023, pp. 2806–2823

2023

[7] [7]

Boullé, D

N. Boullé, D. Halikias, and A. Townsend,Elliptic PDE learning is provably data-efficient, Proceedings of the National Academy of Sciences, 120 (2023), p. e2303904120

2023

[8] [8]

Boullé, S

N. Boullé, S. Kim, T. Shi, and A. Townsend,Learning Green’s functions associated with time-dependent partial differential equations, Journal of Machine Learning Research, 23 (2022), pp. 1–34

2022

[9] [9]

Boullé and A

N. Boullé and A. Townsend,A mathematical guide to operator learning, in Handbook of Numerical Analysis, vol. 25, Elsevier, 2024, pp. 83–125

2024

[10] [10]

Bourgain and D

J. Bourgain and D. Li,Strong ill-posedness of the incompressible Euler equation in borderline Sobolev spaces, Inventiones Mathematicae, 201 (2015), pp. 97–157

2015

[11] [11]

Chen and D

J. Chen and D. Sanz-Alonso,Convergence rates for learning pseudo-differential operators, arXiv preprint arXiv:2601.04473, (2026)

work page arXiv 2026

[12] [12]

,Optimal multiscale learning of linear operators, arXiv preprint arXiv:2606.16913, (2026)

work page arXiv 2026

[13] [13]

K. Chen, M. Krishnan, and H. Yang,Error analysis for learning the time-stepping operator of evolutionary PDEs, arXiv preprint arXiv:2509.04256, (2025)

work page arXiv 2025

[14] [14]

K. Chen, C. W ang, and H. Yang,Deep operator learning lessens the curse of dimensionality for PDEs, Transactions on Machine Learning Research, (2023)

2023

[15] [15]

L. Q. Chen and J. Shen,Applications of semi-implicit Fourier-spectral method to phase field equations, Computer Physics Communications, 108 (1998), pp. 147–158. 27

1998

[16] [16]

Cheng,Energy stable semi-implicit schemes for the 2D Allen–Cahn and fractional Cahn–Hilliard equations, IMA Journal of Numerical Analysis, (2025), p

X. Cheng,Energy stable semi-implicit schemes for the 2D Allen–Cahn and fractional Cahn–Hilliard equations, IMA Journal of Numerical Analysis, (2025), p. draf010

2025

[17] [17]

D. S. Clark,Short proof of a discrete Gronwall inequality, Discrete Applied Mathematics, 16 (1987), pp. 279–281

1987

[18] [18]

M. V. de Hoop, N. B. Kovachki, N. H. Nelsen, and A. M. Stuart,Convergence rates for learning linear operators from noisy data, SIAM/ASA Journal on Uncertainty Quantification, 11 (2023), pp. 480–513

2023

[19] [19]

De Ryck and S

T. De Ryck and S. Mishra,Generic bounds on the approximation error for physics-informed (and) operator learning, Advances in Neural Information Processing Systems, 35 (2022), pp. 10945–10958

2022

[20] [20]

Elbrächter, D

D. Elbrächter, D. Perekrestenko, P. Grohs, and H. Bölcskei,Deep neural network approximation theory, IEEE Transactions on Information Theory, 67 (2021), pp. 2581–2623

2021

[21] [21]

C. M. Elliott and S. Luckhaus,A generalised diffusion equation for phase separation of a multi-component mixture with interfacial free energy, (1991)

1991

[22] [22]

P. J. Flory,Thermodynamics of high polymer solutions, The Journal of chemical physics, 10 (1942), pp. 51–61

1942

[23] [23]

Furuya, K

T. Furuya, K. Taniguchi, and S. Okuda,Quantitative approximation for neural operators in nonlinear parabolic equations, in The Thirteenth International Conference on Learning Representa- tions, 2025

2025

[24] [24]

Gopakumar, S

V. Gopakumar, S. Pamela, L. Zanisi, Z. Li, A. Anandkumar, and M. Team,Fourier neural operator for plasma modelling, arXiv preprint arXiv:2302.06542, (2023)

work page arXiv 2023

[25] [25]

Grohs and F

P. Grohs and F. Voigtlaender,Proof of the theory-to-practice gap in deep learning via sampling complexity bounds for neural network approximation spaces, Foundations of Computational Mathematics, 24 (2024), pp. 1085–1143. [26]B. Guo,Spectral Methods and Their Applications, World Scientific, 1998

2024

[26] [26]

Y. He,Stability and error analysis for a spectral Galerkin method for the Navier-Stokes equations with H2 or H1 initial data, Numerical Methods for Partial Differential Equations: An International Journal, 21 (2005), pp. 875–904

2005

[27] [27]

2097–2124

,The Euler implicit/explicit scheme for the 2D time-dependent Navier-Stokes equations with smooth or non-smooth initial data, Mathematics of Computation, 77 (2008), pp. 2097–2124

2008

[29] Y. He and Y. Liu, Stability and convergence of the spectral Galerkin method for the Cahn–Hilliard equation, Numerical Methods for Partial Differential Equations: An International Journal, 24 (2008), pp. 1485–1500

[30] M. L. Huggins, Theory of solutions of high polymers, Journal of the American Chemical Society, 64 (1942), pp. 1712–1719

[31] T. Kim and M. Kang, Bounding the Rademacher complexity of Fourier neural operators, Machine Learning, 113 (2024), pp. 2467–2498

[32] Y. Korolev, Two-layer neural networks with values in a Banach space, SIAM Journal on Mathematical Analysis, 54 (2022), pp. 6358–6389

[33] S. Kotsuki, K. Shiraishi, and A. Okazaki, Ensemble data assimilation to diagnose AI-based weather prediction models: a case with ClimaX version 0.3.1, Geoscientific Model Development, 18 (2025), pp. 7215–7225

[34] N. Kovachki, S. Lanthaler, and S. Mishra, On universal approximation and error bounds for Fourier neural operators, Journal of Machine Learning Research, 22 (2021), pp. 1–76

[35] N. B. Kovachki, S. Lanthaler, and H. Mhaskar, Data complexity estimates for operator learning, arXiv preprint arXiv:2405.15992, (2024)

[36] N. B. Kovachki, S. Lanthaler, and A. M. Stuart, Operator learning: Algorithms and analysis, Handbook of Numerical Analysis, 25 (2024), pp. 419–467

[37] S. Lanthaler and N. H. Nelsen, Error bounds for learning with vector-valued random features, Advances in Neural Information Processing Systems, 36 (2023), pp. 71834–71861

[38] S. Lanthaler, A. M. Stuart, and M. Trautner, Discretization error of Fourier neural operators, arXiv preprint arXiv:2405.02221, (2024)

[39] D. Li, Z. Qiao, and T. Tang, Characterizing the stabilization size for semi-implicit Fourier-spectral method to phase field equations, SIAM Journal on Numerical Analysis, 54 (2016), pp. 1653–1681

[40] D. Li, C. Quan, and T. Tang, Stability and convergence analysis for the implicit-explicit method to the Cahn–Hilliard equation, Mathematics of Computation, 91 (2022), pp. 785–809

[41] D. Li and T. Tang, Stability of the semi-implicit method for the Cahn–Hilliard equation with logarithmic potentials, Annals of Applied Mathematics, 37 (2021), pp. 31–60

[42] W. Li and D. Sanz-Alonso, Continuous data assimilation with learned surrogate dynamics, arXiv preprint arXiv:2606.00480, (2026)

[43] Z. Li, D. Z. Huang, B. Liu, and A. Anandkumar, Fourier neural operator with learned deformations for PDEs on general geometries, Journal of Machine Learning Research, 24 (2023), pp. 1–26

[44] Z. Li, N. B. Kovachki, K. Azizzadenesheli, B. Liu, K. Bhattacharya, A. M. Stuart, and A. Anandkumar, Fourier neural operator for parametric partial differential equations, in 9th International Conference on Learning Representations, ICLR 2021, Virtual Event, Austria, May 3-7, 2021, OpenReview.net, 2021

[45] H. Liu, H. Yang, M. Chen, T. Zhao, and W. Liao, Deep nonparametric estimation of operators between infinite dimensional spaces, Journal of Machine Learning Research, 25 (2024), pp. 1–67

[46] N. Liu, S. Jafarzadeh, and Y. Yu, Domain agnostic Fourier neural operators, Advances in Neural Information Processing Systems, 36 (2023), pp. 47438–47450

[47] N. H. Nelsen and Y. Yang, Operator learning meets inverse problems: A probabilistic perspective, arXiv preprint arXiv:2508.20207, (2025)

[48] J. Pathak, S. Subramanian, P. Harrington, S. Raja, A. Chattopadhyay, M. Mardani, T. Kurth, D. Hall, Z. Li, K. Azizzadenesheli, et al., FourCastNet: A global data-driven high-resolution weather model using adaptive Fourier neural operators, arXiv preprint arXiv:2202.11214, (2022)

[49] J. C. Robinson, Infinite-Dimensional Dynamical Systems: An Introduction to Dissipative Parabolic PDEs and the Theory of Global Attractors, vol. 28, Cambridge University Press, 2001

[50] D. Sanz-Alonso and N. Waniorek, Long-time accuracy of ensemble Kalman filters for chaotic dynamical systems and machine-learned dynamical systems, SIAM Journal on Applied Dynamical Systems, 24 (2025), pp. 2246–2286

[51] E. M. Stein, Singular Integrals and Differentiability Properties of Functions, no. 30, Princeton University Press, 1970

[52] U. Subedi and A. Tewari, Controlling statistical, discretization, and truncation errors in learning Fourier linear operators, arXiv preprint arXiv:2408.09004, (2024)

[53] U. Subedi and A. Tewari, Operator learning: A statistical perspective, arXiv preprint arXiv:2504.03503, (2025)

[54] R. Temam, Infinite-Dimensional Dynamical Systems in Mechanics and Physics, vol. 68, Springer Science & Business Media, 2012

[55] M. J. Wainwright, High-Dimensional Statistics: A Non-Asymptotic Viewpoint, vol. 48, Cambridge University Press, 2019

[56] C. Wang and A. Townsend, Operator learning for hyperbolic partial differential equations, arXiv preprint arXiv:2312.17489, (2023)

[57] G. Wen, Z. Li, K. Azizzadenesheli, A. Anandkumar, and S. M. Benson, U-FNO—An enhanced Fourier neural operator-based deep-learning model for multiphase flow, Advances in Water Resources, 163 (2022), p. 104180

[58] D. Yarotsky, Error bounds for approximations with deep ReLU networks, Neural Networks, 94 (2017), pp. 103–114

[59] T. Zhou, X. Wan, D. Z. Huang, Z. Li, Z. Peng, A. Anandkumar, J. F. Brady, P. W. Sternberg, and C. Daraio, AI-aided geometric design of anti-infection catheters, Science Advances, 10 (2024), p. eadj1741

A Background on Multilayer Perceptrons and FNOs

This appendix overviews existing approximation results for multilayer perceptrons and FNOs.

A.1 Multilayer...

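This, combined with Lemma B.5, yields
\[
|\mathcal{E}| \le \tfrac{1}{2} r_N \|e^{j-1}\|_{L^2}^2 + \tfrac{1}{2} \|\mathcal{A}^{1/2} e^{j}\|_{L^2}^2 + \tfrac{1}{2} \bigl\| \mathcal{A}^{-1/2} \bigl( \Psi_{\mathcal{G}}(u^{j-1}) - \mathcal{P}_N \mathcal{D}_s\bigl(\mathcal{G}(u^{j-1})\bigr) - \bar{E}(u^{j-1}) \bigr) \bigr\|_{L^2}^2 + \tfrac{1}{2} \|\mathcal{A}^{1/2} e^{j}\|_{L^2}^2 + \tfrac{1}{2} r_N \|e^{j-1}\|_{L^2}^2 + \tfrac{1}{2} \|e^{j}\|_{L^2}^2 + \tfrac{1}{2} \|\bar{E}(u^{j-1})\|_{L^2}^2 + \tfrac{1}{2} \|e^{j}\|_{L^2}^2,
\]
where we have defined $r_N = r C_{\mathcal{D}}^2 c^{-1} N^{d(p-1)} (U')^{2p-2}$. Assuming that $U'$ is taken to be large enough such that $\|u^{j}\|_{L^2} \le U'$ (we will specify...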

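This, combined with the Lipschitz bound encoded in the definition of the nonlinearity class $G(\alpha, C_{\mathcal{G}})$, yields that
\[
|\mathcal{E}| \le \frac{(d \tilde{d} C_{\mathcal{D}} C_{\mathcal{G}})^2}{2c} \|e^{j-1}\|_{L^2}^2 + \|\mathcal{A}^{1/2} e^{j}\|_{L^2}^2 + \tfrac{1}{2} \bigl\| \mathcal{A}^{-1/2} \bigl( \mathcal{P}_N \mathcal{D}_s\bigl(\mathcal{G}(u^{j-1})\bigr) - \Psi_{\mathcal{G}}(u^{j-1}) \bigr) \bigr\|_{L^2}^2 .
\]
Assuming that $U'$ is chosen to be large enough such that $\|u^{j}\|_{L^2} \le U'$ (we will specify a specific choice of $U'$ shortly below), Lemma B.6 yields |ℰ|≤(𝑑˜𝑑...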

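For the free energy functional
\[
\mathcal{E}(u) = \int_{\mathbb{T}^2} \Bigl( \frac{\nu}{2} |\nabla u|^2 - \frac{\theta_c}{2} u^2 + \mathcal{W}(u) \Bigr) \, dx, \tag{D.5}
\]
the iterates are energy stable: for all $j \ge 0$,
\[
\mathcal{E}(u_N^{j+1}) \le \mathcal{E}(u_N^{j}). \tag{D.6}
\]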
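The energy in (D.5) and the stability property (D.6) can be probed numerically. The following is a minimal sketch, not the paper's implementation. It assumes the torus is $[0, 2\pi)^2$ sampled on a uniform $n \times n$ grid, evaluates the gradient with the FFT, and takes $\mathcal{W}$ as a user-supplied callable. The names `wavenumbers`, `free_energy`, `log_potential`, and `semi_implicit_step` are hypothetical. The logarithmic potential is only one plausible choice of $\mathcal{W}$, and the time step implements the semi-implicit form $(u^{j+1} - u^{j})/\tau = -\nu \Delta^2 u^{j+1} - \theta_c \Delta u^{j+1} + \Delta \tilde{w}(u^{j})$ with a user-supplied `w_tilde`, which is not necessarily scheme (4.8).

```python
import numpy as np


def wavenumbers(n):
    """Integer wavenumbers on an n x n grid of the torus [0, 2*pi)^2 (assumed domain)."""
    k = np.fft.fftfreq(n, d=1.0 / n)  # 0, 1, ..., n/2 - 1, -n/2, ..., -1
    return np.meshgrid(k, k, indexing="ij")


def free_energy(u, nu, theta_c, W):
    """Grid approximation of (D.5): integral of nu/2 |grad u|^2 - theta_c/2 u^2 + W(u)."""
    n = u.shape[0]
    kx, ky = wavenumbers(n)
    u_hat = np.fft.fft2(u)
    ux = np.fft.ifft2(1j * kx * u_hat).real  # spectral derivative in x
    uy = np.fft.ifft2(1j * ky * u_hat).real  # spectral derivative in y
    density = 0.5 * nu * (ux**2 + uy**2) - 0.5 * theta_c * u**2 + W(u)
    cell_area = (2.0 * np.pi / n) ** 2
    return float(density.sum() * cell_area)


def log_potential(u, theta=1.0):
    """Hypothetical W: a Flory-Huggins type logarithmic potential, defined for |u| < 1."""
    return 0.5 * theta * ((1 + u) * np.log1p(u) + (1 - u) * np.log1p(-u))


def semi_implicit_step(u, tau, nu, theta_c, w_tilde):
    """One step of the assumed semi-implicit scheme, solved mode by mode in Fourier space."""
    kx, ky = wavenumbers(u.shape[0])
    k2 = kx**2 + ky**2  # symbol of -Laplacian
    rhs = np.fft.fft2(u) - tau * k2 * np.fft.fft2(w_tilde(u))
    # Linear terms treated implicitly; tau must be small enough that denom stays positive.
    denom = 1.0 + tau * nu * k2**2 - tau * theta_c * k2
    return np.fft.ifft2(rhs / denom).real
```

To explore (D.6) empirically, one can call `free_energy` before and after `semi_implicit_step` and compare the two values. Whether the energy decreases depends on the choice of `w_tilde`, `W`, and `tau`; this sketch does not guarantee it.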

The iterates $u_N^{j}$ are well defined for all $j \ge 1$ and there exists $V(U, \delta_0, \nu, \theta, \theta_c)$ such that
\[
\sup_{j \ge 1} \|u_N^{j}\|_{H^5} \le V. \tag{D.7}
\]

There exists $\delta_1(U, \delta_0, \nu, \theta, \theta_c) \in (0, 1)$ such that
\[
\sup_{j \ge 1} \|u_N^{j}\|_{L^\infty} \le 1 - \delta_1.
\]
Proof. This theorem is an analogue of the stability result [41, Theorem 1.1] for the scheme (4.8). Note that [41, Theorem 1.1] holds for the scheme
\[
\frac{u^{j+1} - u^{j}}{\tau} = -\nu \Delta^2 u^{j+1} - \theta_c \Delta u^{j+1} + \Delta \bigl( \tilde{w}(u_N^{j}) \bigr),
\]
which, in contrast to (4.8), is spatially continuous and treats the $\theta_c \Delta u$ term implicitly. As explai...