pith. machine review for the scientific record.

arXiv: 2512.01015 · v2 · submitted 2025-11-30 · 💻 cs.LG · math.DS · math.FA

Upper Approximation Bounds for Neural Oscillators

Pith reviewed 2026-05-17 02:31 UTC · model grok-4.3

classification 💻 cs.LG · math.DS · math.FA
keywords neural oscillators · approximation bounds · second-order dynamical systems · multilayer perceptrons · causal operators · error scaling · machine learning · ODE-based models

The pith

Neural oscillators approximate stable second-order dynamical systems with error that scales polynomially in the inverse widths of two MLPs.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes upper bounds on the approximation power of neural oscillators, which combine a second-order ODE with an MLP. It shows these models can approximate causal uniformly continuous operators on temporal functions and, more specifically, uniformly asymptotically incrementally stable second-order dynamical systems. The derived error bounds decrease polynomially as the widths of the MLPs grow, which the authors argue overcomes the curse of parametric complexity that typically demands exponentially more parameters for better accuracy. The same bounding technique extends directly to certain state-space models built from linear time-continuous complex recurrent networks followed by an MLP. Four numerical examples are used to check that the predicted convergence rates hold in practice.

Core claim

The neural oscillator consisting of a second-order ODE followed by a multilayer perceptron is considered. Its upper approximation bound for approximating causal and uniformly continuous operators between continuous temporal function spaces and that for approximating uniformly asymptotically incrementally stable second-order dynamical systems are derived. The established proof method of the approximation bound for approximating the causal continuous operators can also be directly applied to state-space models consisting of a linear time-continuous complex recurrent neural network followed by an MLP. Theoretical results reveal that the approximation error of the neural oscillator for the state

What carries the argument

A neural oscillator formed by solving a second-order ordinary differential equation and feeding the solution through an MLP, with approximation bounds obtained via uniform continuity and incremental stability properties of the target operators and systems.
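
A minimal sketch of the architecture described above, in Python with NumPy. It assumes a concrete form that this page does not spell out: the ODE is y'' = f(y, u(t)) with f a small tanh MLP, zero initial conditions, a semi-implicit Euler integrator, and a second tanh MLP as readout. The names `mlp`, `init_mlp`, and `neural_oscillator`, the layer sizes, and the step size are illustrative choices, not the authors' implementation.

```python
import numpy as np

def init_mlp(sizes, rng):
    """Random weights and zero biases for an MLP with the given layer sizes."""
    weights = [rng.normal(scale=1.0 / np.sqrt(m), size=(n, m))
               for m, n in zip(sizes[:-1], sizes[1:])]
    biases = [np.zeros(n) for n in sizes[1:]]
    return weights, biases

def mlp(x, weights, biases):
    """Tanh hidden layers followed by a linear output layer."""
    for W, b in zip(weights[:-1], biases[:-1]):
        x = np.tanh(W @ x + b)
    return weights[-1] @ x + biases[-1]

def neural_oscillator(u, dt, ode_net, readout_net, state_dim):
    """Integrate y'' = ode_net([y, u(t)]) from rest, then apply readout_net to y(t).

    u has shape (num_steps, input_dim); the result has shape (num_steps, output_dim).
    """
    y = np.zeros(state_dim)  # position-like state
    v = np.zeros(state_dim)  # velocity-like state y'
    outputs = []
    for u_t in u:
        a = mlp(np.concatenate([y, u_t]), *ode_net)  # assumed form of y''
        v = v + dt * a   # semi-implicit Euler: update velocity first,
        y = y + dt * v   # then position with the new velocity
        outputs.append(mlp(y, *readout_net))
    return np.array(outputs)

# Hypothetical sizes: 3 oscillator states, scalar input, scalar output, width 16.
rng = np.random.default_rng(0)
ode_net = init_mlp([3 + 1, 16, 3], rng)
readout_net = init_mlp([3, 16, 1], rng)
t = np.arange(200) * 0.01
u = np.sin(2 * np.pi * t)[:, None]
z = neural_oscillator(u, 0.01, ode_net, readout_net, state_dim=3)
```

Because each output sample is computed from inputs up to the same time index only, the map from `u` to `z` in this sketch is causal by construction, which is the property the bounds rely on.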

If this is right

  • Approximation error for the dynamical systems decreases polynomially with the reciprocals of the MLP widths.
  • The same proof technique yields bounds for causal uniformly continuous operators between temporal function spaces.
  • The bounding method applies without change to state-space models that use a linear time-continuous complex RNN followed by an MLP.
  • Convergence rates of the two error bounds are confirmed by four numerical test cases.
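
The state-space model named in the third bullet can be sketched as follows. This is an illustration under assumptions not stated on this page: a diagonal complex state matrix with eigenvalues `lam`, a zero-order-hold discretisation of x' = diag(lam) x + B u, and a one-hidden-layer tanh MLP acting on the stacked real and imaginary parts. The function names and all sizes are hypothetical.

```python
import numpy as np

def readout_mlp(x, W1, b1, W2, b2):
    """One-hidden-layer tanh MLP on the real and imaginary parts of the complex state."""
    features = np.concatenate([x.real, x.imag])
    return W2 @ np.tanh(W1 @ features + b1) + b2

def state_space_model(u, dt, lam, B, mlp_params):
    """Linear complex recurrence x' = diag(lam) x + B u, followed by an MLP readout.

    The input is held constant over each step (zero-order hold), so the
    update below is exact for piecewise-constant inputs and nonzero lam.
    """
    decay = np.exp(lam * dt)        # e^{lam * dt}, elementwise
    gain = (decay - 1.0) / lam      # integral of e^{lam * s} over one step
    x = np.zeros(lam.shape[0], dtype=complex)
    outputs = []
    for u_t in u:
        x = decay * x + gain * (B @ u_t)
        outputs.append(readout_mlp(x, *mlp_params))
    return np.array(outputs)

# Hypothetical configuration: 4 complex states, scalar input and output.
rng = np.random.default_rng(1)
n, d, width = 4, 1, 8
lam = -0.5 + 1j * rng.uniform(1.0, 3.0, size=n)   # stable: negative real parts
B = rng.normal(size=(n, d)).astype(complex)
mlp_params = (rng.normal(size=(width, 2 * n)), np.zeros(width),
              rng.normal(size=(1, width)), np.zeros(1))
u = np.sin(np.linspace(0.0, 10.0, 100))[:, None]
z = state_space_model(u, 0.1, lam, B, mlp_params)
```

Structurally this mirrors the neural oscillator: a causal dynamical stage followed by a pointwise MLP, which is consistent with the claim that one proof technique covers both.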

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Wider but not exponentially larger MLPs should suffice for high-accuracy modeling of many physical systems that satisfy the stability condition.
  • The architecture may offer efficiency advantages over generic recurrent networks when long-term causal dependencies dominate.
  • Similar polynomial bounds could be pursued for higher-order or non-stable systems to broaden the theoretical coverage.

Load-bearing premise

The target operators must be causal and uniformly continuous on spaces of continuous temporal functions, while the dynamical systems must be uniformly asymptotically incrementally stable second-order systems.
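
For orientation, the three properties in this premise are usually formalised along the following lines. These are the standard textbook forms, written here as an assumption: the symbols Φ, u, v, x, ξ, and β are notation chosen for this sketch, the sup norm is assumed, and the paper's exact definitions may differ in detail.

```latex
% Causality: the output at time t depends only on the input up to time t.
u(s) = v(s)\ \ \forall s \le t
\quad\Longrightarrow\quad
\Phi(u)(t) = \Phi(v)(t).

% Uniform continuity (sup norm assumed):
\forall \varepsilon > 0\ \ \exists \delta > 0:\quad
\|u - v\|_\infty \le \delta
\ \Longrightarrow\
\|\Phi(u) - \Phi(v)\|_\infty \le \varepsilon.

% Uniform asymptotic incremental stability, with x = (y, \dot{y}) the
% stacked state of the second-order system, \xi_1, \xi_2 two initial
% states, and \beta a class-\mathcal{KL} function:
\|x(t;\xi_1,u) - x(t;\xi_2,u)\| \le \beta\big(\|\xi_1 - \xi_2\|,\ t\big)
\quad \text{for all } t \ge 0 \text{ and all admissible inputs } u.
```

The last condition says that trajectories driven by the same input forget their initial conditions at a rate that does not depend on the input.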

What would settle it

A numerical experiment on a uniformly asymptotically incrementally stable second-order system where the observed approximation error fails to decrease polynomially as the widths of the two MLPs are increased would falsify the claimed scaling.
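
One way such an experiment could be evaluated is a log-log fit of error against inverse width. The sketch below uses synthetic placeholder data with a planted exponent, not results from the paper; `empirical_rate` is a hypothetical helper and the widths are arbitrary.

```python
import numpy as np

def empirical_rate(widths, errors):
    """Fit error ~ const * (1/width)**rate by least squares in log-log space."""
    x = np.log(1.0 / np.asarray(widths, dtype=float))
    y = np.log(np.asarray(errors, dtype=float))
    slope, intercept = np.polyfit(x, y, 1)
    return slope, np.exp(intercept)

# Synthetic placeholder data: error = 3 * (1/width)**0.5, no noise.
widths = np.array([8, 16, 32, 64, 128])
errors = 3.0 * (1.0 / widths) ** 0.5

# The fit recovers the planted exponent 0.5 and constant 3.0 up to round-off.
rate, const = empirical_rate(widths, errors)
```

Polynomial scaling shows up as a straight line with positive slope in these coordinates. A slope near zero, or clear curvature that flattens as the widths grow, would be the kind of evidence the paragraph above describes. Since the paper's result is an upper bound, decay faster than predicted would not contradict it.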

Figures

Figures reproduced from arXiv: 2512.01015 by Konstantin M. Zuev, Michael Beer, Yong Xia, Zifeng Huang.

Figure 1. Neural oscillator approximation errors ε̃
Figure 2. Neural oscillator approximation errors ε̃

Original abstract

Neural oscillators, originating from second-order ordinary differential equations (ODEs), have demonstrated strong performance in stably learning causal mappings between long-term sequences or continuous temporal functions, as well as in accurately approximating physical systems. However, theoretically quantifying the capacities of their neural network architectures remains a significant challenge. In this study, the neural oscillator consisting of a second-order ODE followed by a multilayer perceptron (MLP) is considered. Its upper approximation bound for approximating causal and uniformly continuous operators between continuous temporal function spaces and that for approximating uniformly asymptotically incrementally stable second-order dynamical systems are derived. The established proof method of the approximation bound for approximating the causal continuous operators can also be directly applied to state-space models consisting of a linear time-continuous complex recurrent neural network followed by an MLP. Theoretical results reveal that the approximation error of the neural oscillator for approximating the second-order dynamical systems scales polynomially with the reciprocals of the widths of two utilized MLPs, thus overcoming the curse of parametric complexity. The convergence rates of two established approximation error bounds are validated through four numerical cases. These results provide a robust theoretical foundation for the effective application of the neural oscillator in science and engineering.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper considers neural oscillators formed by a second-order ODE followed by an MLP. It derives upper bounds on the approximation error when these oscillators approximate causal and uniformly continuous operators between spaces of continuous temporal functions, and when they approximate uniformly asymptotically incrementally stable second-order dynamical systems. The bounds are shown to scale polynomially with the reciprocals of the widths of the two MLPs. The same proof technique is stated to apply directly to state-space models consisting of a linear time-continuous complex RNN followed by an MLP. Convergence rates of the two bounds are validated numerically on four example cases.

Significance. If the stated polynomial scaling holds under the given assumptions, the result supplies a concrete theoretical guarantee that neural-oscillator architectures can approximate stable dynamical systems without incurring an exponential dependence on the number of parameters. This would be a useful addition to the literature on approximation theory for neural networks applied to continuous-time systems and long-horizon causal mappings.
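
To make "no exponential dependence" concrete, suppose the bound has the form C (1/w₁ + 1/w₂)^k that the simulated rebuttal lower on this page attributes to the paper. That form is not confirmed here, so treat it as an assumption. With equal widths w₁ = w₂ = w and k > 0, the width needed for a target accuracy ε follows by inverting the bound:

```latex
C\left(\frac{1}{w} + \frac{1}{w}\right)^{k} = C\left(\frac{2}{w}\right)^{k} \le \varepsilon
\quad\Longleftrightarrow\quad
w \ge 2\left(\frac{C}{\varepsilon}\right)^{1/k}.
```

The required width therefore grows like ε^(-1/k), which is polynomial in 1/ε. For a fixed-depth MLP the parameter count is polynomial in the width, so the parameter count stays polynomial in 1/ε as well, provided C itself does not hide an exponential factor.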

major comments (2)
  1. [Section deriving the bound for dynamical systems (around the statement of the main theorem)] The central claim that the approximation error for second-order dynamical systems scales polynomially in the reciprocals of the MLP widths rests on the uniform asymptotic incremental stability assumption. The manuscript should make explicit how the stability margin and the time horizon appear in the final bound (e.g., in the constant multiplying the polynomial term) to rule out a hidden exponential factor that would reintroduce the curse of complexity.
  2. [Paragraph discussing the extension to state-space models] The assertion that the causal-operator proof technique applies directly to linear time-continuous complex RNN state-space models requires a short but self-contained argument showing that the complex-valued linear dynamics remain causal and uniformly continuous on the relevant function spaces; without this step the extension is not yet load-bearing.
minor comments (2)
  1. [Numerical validation section] In the numerical experiments, state the precise widths of the two MLPs, the integration scheme for the ODE, and the precise error metric used to produce the reported convergence rates.
  2. [Preliminaries / notation] Clarify whether the uniform continuity of the target operator is with respect to the sup norm or another topology on the space of continuous temporal functions.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful reading and constructive comments on our manuscript. We address each major comment below and have revised the paper to incorporate the suggested clarifications.

  1. Referee: [Section deriving the bound for dynamical systems (around the statement of the main theorem)] The central claim that the approximation error for second-order dynamical systems scales polynomially in the reciprocals of the MLP widths rests on the uniform asymptotic incremental stability assumption. The manuscript should make explicit how the stability margin and the time horizon appear in the final bound (e.g., in the constant multiplying the polynomial term) to rule out a hidden exponential factor that would reintroduce the curse of complexity.

    Authors: We agree that an explicit statement of the dependence is warranted for full transparency. The bound derived under uniform asymptotic incremental stability takes the form C(δ, T) ⋅ (1/w₁ + 1/w₂)^k, where the prefactor C(δ, T) depends on the stability margin δ and time horizon T but is independent of the MLP widths w₁, w₂ and contains no exponential dependence on the number of parameters. In the revised manuscript we will restate the main theorem with this dependence written out explicitly and add a short remark confirming that the polynomial scaling in the reciprocals of the widths is unaffected by δ or T.

    Revision: yes

  2. Referee: [Paragraph discussing the extension to state-space models] The assertion that the causal-operator proof technique applies directly to linear time-continuous complex RNN state-space models requires a short but self-contained argument showing that the complex-valued linear dynamics remain causal and uniformly continuous on the relevant function spaces; without this step the extension is not yet load-bearing.

    Authors: We accept the referee’s observation. Although the underlying proof for causal uniformly continuous operators extends immediately once the linear dynamics are shown to map the relevant function spaces into themselves, we will add a concise, self-contained paragraph immediately after the statement of the extension. This paragraph will verify that the linear time-continuous complex RNN preserves causality (by the variation-of-constants formula) and uniform continuity (by the boundedness of the state-transition matrix on compact time intervals) on the space of continuous temporal functions.

    Revision: yes
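
The argument outlined in the second response can be written out in a few lines. The notation below (state matrix A, input matrix B, initial state x₀, horizon T, constant M) is assumed for this sketch and is not taken from the paper.

```latex
% Linear time-continuous complex RNN and its variation-of-constants solution:
\dot{x}(t) = A\,x(t) + B\,u(t), \qquad x(0) = x_0,
\qquad
x(t) = e^{At}x_0 + \int_0^t e^{A(t-s)}\,B\,u(s)\,\mathrm{d}s .

% Causality: the integral runs over s \in [0, t], so x(t) depends on u
% only through its values up to time t.

% Uniform continuity on [0, T]: for two inputs u, v and the same x_0,
\|x_u(t) - x_v(t)\|
\le \int_0^t \big\|e^{A(t-s)}B\big\|\,\mathrm{d}s \;\sup_{s \le t}\|u(s) - v(s)\|
\le T\,M\,\|B\|\,\|u - v\|_\infty ,
\qquad
M = \sup_{0 \le \tau \le T}\big\|e^{A\tau}\big\| .
```

The state map is therefore Lipschitz in the sup norm on any compact time interval, with a constant that depends on T. Whether that constant can be made independent of T depends on the spectrum of A, which this page does not specify.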

Circularity Check

0 steps flagged

No significant circularity in derivation chain

The paper derives upper bounds on approximation error for causal uniformly continuous operators and for uniformly asymptotically incrementally stable second-order systems using the neural oscillator architecture (second-order ODE followed by MLP). These bounds are obtained by applying established proof techniques for causal operators directly to the stated assumptions of causality, uniform continuity on temporal function spaces, and incremental stability; the resulting polynomial scaling in the reciprocals of the two MLP widths follows from the analysis rather than being presupposed or fitted. No self-definitional reductions, fitted inputs relabeled as predictions, or load-bearing self-citations appear in the derivation. The numerical convergence checks are presented separately as validation and do not enter the theoretical claims.

Axiom & Free-Parameter Ledger

0 free parameters · 3 axioms · 0 invented entities

The central claims rest on standard assumptions about the function spaces and stability properties, with no free parameters or new entities introduced in the abstract.

axioms (3)
  • domain assumption The neural oscillator consists of a second-order ODE followed by an MLP.
    Stated as the architecture considered in the study.
  • domain assumption Target operators are causal and uniformly continuous between continuous temporal function spaces.
    Required for the first approximation bound.
  • domain assumption Dynamical systems are uniformly asymptotically incrementally stable second-order systems.
    Required for the second approximation bound.

pith-pipeline@v0.9.0 · 5508 in / 1408 out tokens · 72693 ms · 2026-05-17T02:31:13.284140+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?

  • matches: The paper's claim is directly supported by a theorem in the formal canon.
  • supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
  • extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
  • uses: The paper appears to rely on the theorem as machinery.
  • contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
  • unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Upper Generalization Bounds for Neural Oscillators

    cs.LG 2026-03 conditional novelty 6.0

    Upper generalization bounds for neural oscillators scale polynomially with MLP size and time length, avoiding the curse of parametric complexity, with numerical validation on a Bouc-Wen nonlinear system.
