From Spectral Methods to Sample Complexity Bounds for Fourier Neural Operators
Pith reviewed 2026-07-02 00:51 UTC · model grok-4.3
The pith
Fourier neural operators achieve polynomial sample complexity for learning time-T solution operators of dissipative evolution equations defined via spectral methods.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Defining classes of evolution operators through spectral methods yields FNO approximation bounds and polynomial sample complexity guarantees for time-T solution operators of dissipative evolution equations. These guarantees hold uniformly over broad families of equations rather than for a single fixed PDE. For polynomial nonlinearities, the learning rates depend primarily on the smoothness of the input space and the dimension of the physical domain. For non-polynomial smooth nonlinearities, they additionally depend on the smoothness of the nonlinear terms and the dissipation strength. The results apply in particular to the Navier-Stokes, Allen-Cahn, and Cahn-Hilliard equations.
What carries the argument
Classes of evolution operators defined through spectral methods, which link stable and accurate spectral discretizations to efficient FNO approximation and learning of the corresponding solution operators.
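To make "stable and accurate spectral discretization" concrete, here is a minimal sketch of a semi-implicit Fourier spectral scheme for a 1D Allen-Cahn equation, u_t = nu * u_xx + u - u^3, on a periodic domain. The specific equation form, the scheme, and all names (`allen_cahn_step`, `solve`, `n_modes`, `tau`, `nu`) are illustrative assumptions, not the discretization analyzed in the paper.

```python
import numpy as np

def allen_cahn_step(u, tau, nu, n_modes):
    """One semi-implicit Fourier step for u_t = nu*u_xx + u - u^3 on [0, 2*pi).

    Diffusion is treated implicitly and the nonlinearity explicitly.
    This is an assumed, illustrative scheme, not the paper's.
    """
    n = u.size
    k = np.fft.fftfreq(n, d=1.0 / n)             # integer wavenumbers
    keep = np.abs(k) <= n_modes                   # truncate to the lowest modes
    u_hat = np.fft.fft(u) * keep
    nonlinear_hat = np.fft.fft(u - u**3) * keep   # explicit, truncated nonlinearity
    u_hat_next = (u_hat + tau * nonlinear_hat) / (1.0 + tau * nu * k**2)
    return np.real(np.fft.ifft(u_hat_next))

def solve(u0, T, tau, nu, n_modes):
    """Approximate the time-T solution operator by repeated stepping."""
    u = u0.copy()
    for _ in range(int(round(T / tau))):
        u = allen_cahn_step(u, tau, nu, n_modes)
    return u

# Hypothetical input: a smooth periodic initial condition on 128 grid points.
x = np.linspace(0.0, 2.0 * np.pi, 128, endpoint=False)
u0 = 0.5 * np.sin(x)
uT = solve(u0, T=1.0, tau=1e-2, nu=0.1, n_modes=16)
```

The map `u0 -> uT` is a discrete stand-in for the time-T solution operator that an FNO would be trained to approximate. Each step uses only Fourier transforms, a pointwise nonlinearity, and a mode-wise multiplier, which is the structural similarity the review's premise relies on.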
If this is right
- Polynomial sample complexity holds uniformly across broad families of dissipative equations for polynomial nonlinearities, with rates set by input smoothness and domain dimension.
- For equations with non-polynomial smooth nonlinearities the sample complexity remains polynomial but now also scales with nonlinearity smoothness and dissipation strength.
- The same guarantees apply to the Navier-Stokes, Allen-Cahn, and Cahn-Hilliard equations as instances of the defined classes.
- Classical spectral approximation theory directly yields the FNO bounds once the operator is placed in the spectral-method class.
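The connection between FNOs and spectral methods in the points above can be illustrated with a single Fourier layer in the standard form v -> sigma(W v + F^{-1}(R * F v)), where only the lowest Fourier modes are retained. The following is a minimal 1D NumPy sketch. The function name `fourier_layer`, the ReLU activation, and the random weights are assumptions made for illustration, not details taken from the paper.

```python
import numpy as np

def fourier_layer(v, R, W, b):
    """Single 1D Fourier layer (illustrative sketch).

    v: (n_grid, c_in) real samples on a uniform periodic grid
    R: (n_modes, c_in, c_out) complex multipliers for the lowest modes
    W: (c_in, c_out) pointwise linear weights, b: (c_out,) bias
    """
    n_grid = v.shape[0]
    n_modes = R.shape[0]
    v_hat = np.fft.rfft(v, axis=0)                        # (n_grid//2 + 1, c_in)
    out_hat = np.zeros((v_hat.shape[0], R.shape[2]), dtype=complex)
    # Mix channels mode by mode; higher modes stay zero (spectral truncation).
    out_hat[:n_modes] = np.einsum("kc,kcd->kd", v_hat[:n_modes], R)
    spectral = np.fft.irfft(out_hat, n=n_grid, axis=0)    # back to physical space
    pointwise = v @ W + b                                 # local linear term
    return np.maximum(spectral + pointwise, 0.0)          # ReLU (assumed activation)

# Hypothetical sizes and random weights, for demonstration only.
rng = np.random.default_rng(0)
n_grid, c_in, c_out, n_modes = 64, 2, 3, 8
v = rng.standard_normal((n_grid, c_in))
R = rng.standard_normal((n_modes, c_in, c_out)) + 1j * rng.standard_normal((n_modes, c_in, c_out))
W = rng.standard_normal((c_in, c_out))
b = np.zeros(c_out)
out = fourier_layer(v, R, W, b)
```

Because the spectral term is a truncated Fourier multiplier and the remaining terms act pointwise, the layer commutes with shifts of the periodic grid, just like the operations in a Fourier spectral time step.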
Where Pith is reading between the lines
- The spectral-method classes could be used to certify FNO performance on other dissipative equations before training, by checking only their spectral discretization properties.
- If the premise holds, replacing spectral discretizations with other stable schemes might produce analogous polynomial bounds for alternative neural operators.
- The uniform-over-families result suggests that a single set of FNO hyperparameters could work across an entire family of equations without retuning per PDE.
Load-bearing premise
FNOs can efficiently approximate and learn solution operators whenever these operators admit stable and accurate spectral discretizations.
What would settle it
A direct computation showing that the number of samples needed to learn the Navier-Stokes solution operator to fixed accuracy grows super-polynomially in the input smoothness or domain dimension parameters would falsify the polynomial sample complexity claim.
Original abstract
We establish approximation and learning guarantees for Fourier neural operators (FNOs) applied to time-$T$ solution operators of dissipative evolution equations. The analysis builds on the premise that FNOs can efficiently approximate and learn solution operators whenever these operators admit stable and accurate spectral discretizations. To formalize this idea, we introduce classes of evolution operators defined through spectral methods and derive FNO approximation bounds and polynomial sample complexity guarantees for these classes. For equations with polynomial nonlinearities, the learning rates depend primarily on the smoothness of the input space and the dimension of the physical domain. Our results hold uniformly over broad families of dissipative equations, rather than for a single fixed PDE, and apply in particular to the Navier--Stokes, Allen--Cahn, and Cahn--Hilliard equations. For equations with non-polynomial smooth nonlinearities, we prove that polynomial sample complexity still holds with rates that now additionally depend on the smoothness of the nonlinear terms and the dissipation strength. Overall, we connect classical spectral approximation theory with modern operator learning and explain when FNOs can learn nonlinear evolution operators efficiently.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims to establish approximation and learning guarantees for Fourier neural operators (FNOs) applied to time-T solution operators of dissipative evolution equations. It introduces classes of evolution operators defined through spectral methods, derives FNO approximation bounds, and provides polynomial sample complexity guarantees. These hold uniformly over broad families of dissipative equations (rather than single fixed PDEs), with rates depending primarily on input smoothness and domain dimension for polynomial nonlinearities, and additionally on nonlinearity smoothness and dissipation strength for non-polynomial cases. The results apply in particular to the Navier-Stokes, Allen-Cahn, and Cahn-Hilliard equations, connecting spectral approximation theory to operator learning.
Significance. If the central derivations and uniformity claims hold, the work would be significant for providing explicit, polynomial sample complexity bounds for FNOs on broad classes of dissipative PDEs, rather than isolated examples. This strengthens the theoretical basis for operator learning in scientific ML by linking it directly to classical spectral methods, with potential implications for justifying FNO use on families of evolution equations.
Major comments (1)
- [Abstract] Abstract: The central claim that polynomial sample complexity 'hold[s] uniformly over broad families of dissipative equations' for non-polynomial nonlinearities is load-bearing for the paper's contribution, yet the rates 'additionally depend on ... the dissipation strength.' No uniform lower bound on dissipation is stated for the family, so the hidden 1/dissipation^α factor can make the complexity non-polynomial (and non-uniform) for members of the family with weak dissipation. This directly risks the uniformity guarantee while the spectral-discretization premise remains intact.
Simulated Author's Rebuttal
We thank the referee for the careful reading and for identifying a potential source of ambiguity in the uniformity statement for non-polynomial nonlinearities. We address the comment below and will revise the manuscript accordingly.
Point-by-point responses
Referee: [Abstract] Abstract: The central claim that polynomial sample complexity 'hold[s] uniformly over broad families of dissipative equations' for non-polynomial nonlinearities is load-bearing for the paper's contribution, yet the rates 'additionally depend on ... the dissipation strength.' No uniform lower bound on dissipation is stated for the family, so the hidden 1/dissipation^α factor can make the complexity non-polynomial (and non-uniform) for members of the family with weak dissipation. This directly risks the uniformity guarantee while the spectral-discretization premise remains intact.
Authors: We agree that the abstract does not explicitly state a uniform lower bound on dissipation strength for the families considered in the non-polynomial case. The analysis in the body of the paper treats dissipation strength as a fixed parameter of each family, and the sample-complexity bound is allowed to depend on it. Uniformity is with respect to the other parameters (input smoothness, dimension, nonlinearity smoothness) within any family whose dissipation is bounded away from zero by a positive constant. This is consistent with the spectral-discretization premise, which naturally parameterizes families by such constants. However, the abstract's phrasing could be misread as claiming uniformity even as dissipation approaches zero. We will revise the abstract and the corresponding statement in the introduction to make the dependence on a uniform positive lower bound on dissipation explicit, clarifying that the polynomial rates hold uniformly over each such family.
Revision: yes
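The point at issue in this exchange can be written schematically. Suppose, as the referee's "1/dissipation^α" remark suggests, that the sample-complexity bound takes the form below. The symbols ν (dissipation strength), ν_0 (its lower bound), α, β, and C are illustrative assumptions and are not taken from the paper's theorems.

```latex
% Schematic only: \nu, \nu_0, \alpha, \beta, C are illustrative symbols.
\[
  n(\varepsilon,\nu) \;\le\; C\,\nu^{-\alpha}\,\varepsilon^{-\beta}
  \quad\Longrightarrow\quad
  \sup_{\nu \ge \nu_0} n(\varepsilon,\nu) \;\le\; C\,\nu_0^{-\alpha}\,\varepsilon^{-\beta}
  \qquad (\nu_0 > 0,\ \alpha \ge 0),
\]
\[
  \text{whereas}\quad
  \sup_{0 < \nu \le 1} C\,\nu^{-\alpha}\,\varepsilon^{-\beta} \;=\; \infty
  \quad\text{whenever } \alpha > 0 .
\]
```

Under this assumed form, the bound is polynomial in 1/ε uniformly over any family with dissipation at least ν_0, while the same bound gives no uniform control over a family in which ν can approach zero. The second display concerns the upper bound only. It does not by itself show that the true sample complexity blows up.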
Circularity Check
Derivation from spectral discretization classes to FNO bounds is self-contained with no reduction to inputs
Full rationale
The paper introduces classes of evolution operators explicitly defined through spectral methods, then derives approximation bounds and sample complexity results for FNOs on those classes. This constitutes a direct theoretical mapping rather than a self-definitional loop, fitted prediction, or load-bearing self-citation chain. No equations or claims in the provided text reduce the central guarantees to tautologies or renamings of the inputs; the polynomial rates for polynomial nonlinearities and the additional dependence on dissipation for non-polynomial cases are presented as derived consequences of the spectral premise. The uniformity statement is a stated result, not a definitional artifact.
Axiom & Free-Parameter Ledger
Axioms (1)
- Domain assumption: FNOs can efficiently approximate and learn solution operators whenever these operators admit stable and accurate spectral discretizations.