Efficient Approximation for Encoder--Decoder Neural Operators via Variation Spaces

Jia-Qi Yang; Lei Shi

arXiv: 2606.01244 · v1 · pith:JCVXCDDDnew · submitted 2026-05-31 · stat.ML · cs.LG · cs.NA · math.FA · math.NA · math.ST · stat.TH


Pith reviewed 2026-06-28 16:21 UTC · model grok-4.3


keywords: neural operators, encoder-decoder networks, variation space, approximation bounds, Bochner norm, operator learning, finite-width approximation


The pith

For operators in the variation space, encoder-decoder two-layer networks achieve approximation error that decomposes into input and output encoding errors plus an N^{-1/2} term independent of encoding dimensions.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper defines a variation space as an infinite-dimensional structural class for nonlinear operators using vector-valued measures placed directly on the input and output spaces. For any operator in this space, the error of encoder-decoder two-layer networks measured in the Bochner L^q norm splits into the input encoding error, the output encoding error, and a finite-width term of order N^{-1/2} whose multiplicative constant does not grow with the chosen encoding dimensions. When the encoding errors themselves decay polynomially with dimension, the overall approximation and learning rates become algebraic. A reader would care because the result supplies explicit guarantees for neural operator learning that cover a wider family of targets than the usual Lipschitz or Fréchet-differentiable classes.
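
To make the architecture concrete, here is a minimal pure-Python sketch of an encoder-decoder two-layer network acting on functions sampled on [0, 1]. It is an illustration only, not the paper's construction: the cosine encoding basis, the ReLU activation, the random weights, the 1/N output scaling, and all names (`cosine_mode`, `make_network`, `d_in`, `d_out`, `width`) are assumptions chosen for brevity.

```python
import math
import random


def cosine_mode(k, x):
    """k-th cosine mode on [0, 1]; a hypothetical choice of encoding basis."""
    return math.cos(math.pi * k * x)


def make_network(d_in, d_out, width, seed=0):
    """Build a randomly initialised encoder-decoder two-layer network.

    d_in / d_out are the input / output encoding dimensions and width is
    the number N of hidden neurons. All weights are random placeholders.
    """
    rng = random.Random(seed)
    W = [[rng.gauss(0, 1) for _ in range(d_in)] for _ in range(width)]
    b = [rng.gauss(0, 1) for _ in range(width)]
    # Outer weights scaled by 1/width (an assumed normalisation).
    A = [[rng.gauss(0, 1) / width for _ in range(width)] for _ in range(d_out)]

    def forward(u_samples, grid):
        n = len(grid)
        # Encoder: d_in cosine coefficients via a crude Riemann sum.
        z = [
            sum(u * cosine_mode(k, x) for u, x in zip(u_samples, grid)) / n
            for k in range(d_in)
        ]
        # Hidden layer: N ReLU neurons acting on the encoded input.
        h = [
            max(0.0, sum(w * zi for w, zi in zip(row, z)) + bj)
            for row, bj in zip(W, b)
        ]
        # Linear output layer: d_out coefficients.
        c = [sum(a * hj for a, hj in zip(row, h)) for row in A]
        # Decoder: expand the coefficients back onto the grid.
        return [sum(c[k] * cosine_mode(k, x) for k in range(d_out)) for x in grid]

    return forward


if __name__ == "__main__":
    grid = [i / 49 for i in range(50)]
    net = make_network(d_in=4, d_out=3, width=16)
    v = net([math.sin(2 * math.pi * x) for x in grid], grid)
    print(len(v))
```

In this sketch the encoder and decoder are fixed, and the two-layer part holds width * (d_in + 1 + d_out) adjustable numbers.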

Core claim

Operators belonging to the variation space admit approximation by encoder-decoder two-layer networks whose error in the Bochner L^q norm is bounded by the sum of the input encoding error, the output encoding error, and a finite-width approximation term of order N^{-1/2} whose constant is independent of the input and output encoding dimensions. Polynomial decay of the encoding errors then produces algebraic approximation and learning rates. The bounds supply theoretical guarantees for efficient neural operator learning beyond general Lipschitz or Fréchet differentiable operator classes.
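
Schematically, the claim has the following shape. The notation is assumed here for illustration and is not taken from the paper: G is the target operator, G_N the network with N hidden neurons, d_X and d_Y the input and output encoding dimensions, and C a constant that depends on G but not on d_X or d_Y. The precise norm, measure, and constants are those of the paper's theorem, which is not reproduced on this page.

```latex
\| G - G_N \|_{L^q}
\;\lesssim\;
\underbrace{\varepsilon_{\mathrm{in}}(d_X)}_{\text{input encoding}}
+ \underbrace{\varepsilon_{\mathrm{out}}(d_Y)}_{\text{output encoding}}
+ \underbrace{C \, N^{-1/2}}_{\text{finite width}}
```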

What carries the argument

The variation space, an infinite-dimensional structural class for nonlinear operators defined through vector-valued measures directly on the input and output spaces, which enables the decomposed error bound.
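
For orientation, the finite-dimensional scalar analogue from the function-space theory of two-layer networks can be written as below. This is background rather than the paper's definition: the symbols \sigma (activation), \Omega (parameter set), and \mu (a signed measure) are notation assumed here. According to the abstract, the paper instead works with vector-valued measures placed directly on the input and output spaces.

```latex
f(x) = \int_{\Omega} \sigma\bigl(\langle w, x \rangle + b\bigr) \, \mathrm{d}\mu(w, b),
\qquad
\| f \|_{\mathrm{var}}
= \inf \Bigl\{ |\mu|(\Omega) \;:\;
  f = \int_{\Omega} \sigma\bigl(\langle w, x \rangle + b\bigr) \, \mathrm{d}\mu(w, b) \Bigr\}
```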

If this is right

When input and output encoding errors decay polynomially in the encoding dimensions, algebraic approximation and learning rates follow.
The finite-width approximation term of order N^{-1/2} holds with a constant independent of the input and output encoding dimensions.
The bounds extend theoretical guarantees to operator classes beyond general Lipschitz or Fréchet differentiable ones.
Encoder-decoder two-layer networks suffice to realize the stated rates without requiring width to scale with encoding dimension.
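
A short worked calculation shows how polynomial encoding decay turns into an algebraic rate. The decay exponents s, t > 0 and the constants C_1, C_2, C are hypothetical placeholders; the paper's own exponents and its learning-rate statement may differ.

```latex
% Assumed polynomial decay of the encoding errors:
\varepsilon_{\mathrm{in}}(d_X) \le C_1 \, d_X^{-s},
\qquad
\varepsilon_{\mathrm{out}}(d_Y) \le C_2 \, d_Y^{-t}.

% Plugging into the three-term bound:
\| G - G_N \|_{L^q} \;\lesssim\; C_1 \, d_X^{-s} + C_2 \, d_Y^{-t} + C \, N^{-1/2}.

% Balancing: choose d_X \ge N^{1/(2s)} and d_Y \ge N^{1/(2t)}, so that
% d_X^{-s} \le N^{-1/2} and d_Y^{-t} \le N^{-1/2}. Because C does not
% depend on d_X or d_Y, enlarging the encodings costs nothing in the last term:
\| G - G_N \|_{L^q} \;\lesssim\; (C_1 + C_2 + C) \, N^{-1/2}.
```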

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The three-way error split suggests that practical design should balance encoding accuracy against network width rather than increasing width alone.
Operators that fail to belong to the variation space may need deeper encoders, different activation choices, or alternative architectures to recover comparable rates.
In applications one could attempt to verify variation-space membership by checking whether the target operator admits a representation via a suitable vector-valued measure on the input-output spaces.
The independence from encoding dimension may carry over to other norms or to networks with more than two layers provided the variation-space structure is preserved.
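
The balancing point in the first item above can be illustrated numerically. The sketch below evaluates a schematic three-term bound; the function names `bound_terms` and `total_bound`, the decay exponents `s` and `t`, and all constants are hypothetical and set to 1 purely for illustration.

```python
def bound_terms(d_in, d_out, width, s=1.0, t=1.0,
                c_in=1.0, c_out=1.0, c_width=1.0):
    """Return the three terms of a schematic bound (constants hypothetical)."""
    enc_in = c_in * d_in ** (-s)      # assumed polynomial input-encoding decay
    enc_out = c_out * d_out ** (-t)   # assumed polynomial output-encoding decay
    # N^{-1/2} term; c_width is taken independent of d_in and d_out.
    finite_width = c_width * width ** (-0.5)
    return enc_in, enc_out, finite_width


def total_bound(d_in, d_out, width, **kwargs):
    """Sum of the three terms."""
    return sum(bound_terms(d_in, d_out, width, **kwargs))


if __name__ == "__main__":
    # Fixed encodings: growing the width alone stalls at the encoding error.
    for width in (10**2, 10**4, 10**6):
        print(width, round(total_bound(8, 8, width), 4))
    # Larger encodings with a moderate width.
    print(round(total_bound(64, 64, 10**4), 4))
```

With these placeholder values, the total for d_in = d_out = 8 cannot drop below the encoding floor of 0.25 however large the width becomes, whereas increasing both encoding dimensions to 64 at width 10^4 gives a smaller total.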

Load-bearing premise

The target nonlinear operators belong to the variation space defined through vector-valued measures directly on the input and output spaces.

What would settle it

An explicit nonlinear operator shown to lie in the variation space whose approximation error by encoder-decoder two-layer networks either fails to decompose into the three stated terms or has a multiplicative constant that grows with the encoding dimensions.

Original abstract

We study operator learning using encoder--decoder neural networks. Inspired by the function-space theory of neural networks, we introduce a variation space as an infinite-dimensional structural class for nonlinear operators. This space is defined through vector-valued measures directly on the input and output spaces. For operators in this space, we establish approximation bounds for encoder--decoder two-layer networks in the Bochner $L^q$ norm. The resulting error bound decomposes into the input encoding error, the output encoding error, and a finite-width approximation term of order $N^{-1/2}$, with a constant independent of the input and output encoding dimensions. When the input and output encoding errors decay polynomially in the encoding dimensions, these estimates yield algebraic approximation and learning rates. The results provide an theoretical guarantees for efficient neural operator learning beyond general Lipschitz or Fr\'echet differentiable operator classes.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

The paper defines a variation space via vector-valued measures and derives an error decomposition for encoder-decoder networks with an N^{-1/2} term whose constant is independent of encoding dimensions.


The main point is that operators in this variation space get an approximation bound in the Bochner L^q norm that splits into input encoding error, output encoding error, and a finite-width term of order N^{-1/2} whose prefactor does not grow with the encoding dimensions. When the encoding errors decay polynomially, the whole thing yields algebraic rates.

What is new is the specific construction of the space through vector-valued measures on the input and output spaces, together with the claim that this structure produces the dimension-independent constant for the two-layer encoder-decoder case. That is a step past the general Lipschitz or Fréchet classes mentioned in the abstract.

The decomposition itself is presented cleanly and the independence result is stated directly. The paper also notes that the bounds are conditional on membership in the space, which keeps the logic straightforward.

The soft spot is that the entire result rests on operators actually living in this space, yet the abstract gives no examples of which common operators satisfy the condition or how restrictive the class is in practice. Without the proofs it is also impossible to check whether the independence holds without hidden dependence on the encoding maps. The lack of any numerical illustration is minor for a theory paper but leaves the practical reach unclear.

This is for readers working on approximation theory for neural operators and scientific machine learning. Someone who needs concrete rates for encoder-decoder architectures would find the decomposition useful. The paper shows clear thinking on its own terms and the central claim does not contain obvious internal contradictions, so it deserves a serious referee.

Referee Report

0 major / 1 minor

Summary. The paper introduces a 'variation space' for nonlinear operators, defined via vector-valued measures on the input and output spaces. For operators belonging to this space, it derives approximation bounds for encoder-decoder two-layer networks in the Bochner L^q norm. The error decomposes into an input encoding error, an output encoding error, and a finite-width term of order N^{-1/2} whose constant is independent of the encoding dimensions. When the encoding errors decay polynomially with dimension, the bounds imply algebraic approximation and learning rates. The results are positioned as providing theoretical guarantees for efficient neural operator learning that go beyond general Lipschitz or Fréchet-differentiable operator classes.

Significance. If the central decomposition and independence of the constant from encoding dimensions hold, the work supplies a new structural class (the variation space) under which encoder-decoder architectures achieve dimension-independent approximation rates. This is a concrete advance over existing operator-learning theory that typically requires stronger regularity assumptions or yields worse dependence on encoding dimensions. The explicit error decomposition and the polynomial-rate corollary are the load-bearing contributions.

minor comments (1)

Abstract, last sentence: 'an theoretical guarantees' is a grammatical error and should read 'theoretical guarantees'.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive summary of our manuscript, the recognition of its significance, and the recommendation for minor revision. The referee's description accurately reflects the introduction of variation spaces, the error decomposition into encoding and finite-width terms, and the resulting algebraic rates under polynomial encoding decay.

Circularity Check

0 steps flagged

No significant circularity; derivation self-contained


The paper defines the variation space externally via vector-valued measures on input/output spaces as a new structural class. Approximation bounds and error decomposition (input/output encoding errors plus N^{-1/2} term with dimension-independent constant) are derived conditionally for operators in this space. No self-citations, self-definitional reductions, fitted parameters called predictions, or ansatz smuggling appear; the claims rest on the independent space definition and standard neural approximation arguments without reducing to inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The central claim rests on the newly introduced variation space and the assumption that target operators lie inside it; no numerical free parameters are mentioned.

axioms (1)

domain assumption: Nonlinear operators of interest belong to the variation space defined through vector-valued measures on input and output spaces.
This membership is required for the stated approximation bounds to apply.

invented entities (1)

variation space (no independent evidence)
purpose: Infinite-dimensional structural class for nonlinear operators enabling the approximation analysis
Newly defined class; no independent evidence supplied in abstract.

pith-pipeline@v0.9.1-grok · 5688 in / 1336 out tokens · 26666 ms · 2026-06-28T16:21:39.736827+00:00 · methodology


Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Generalization Guarantees for Multi-Input Neural Operator Learning in Sobolev Spaces
cs.LG · 2026-06 · unverdicted · novelty 6.0

Derives explicit approximation and generalization rates for multi-input neural operators in Sobolev spaces that quantify each input's contribution to the error.

