arXiv: 2302.03286 · v3 · submitted 2023-02-07 · 🧮 math.NA · cs.NA · stat.ML

Algorithmically Designed Artificial Neural Networks (ADANNs): Higher order deep operator learning for parametric partial differential equations

Pith reviewed 2026-05-24 09:34 UTC · model grok-4.3

classification 🧮 math.NA · cs.NA · stat.ML
keywords ADANNs · deep operator learning · parametric partial differential equations · artificial neural networks · numerical approximation · operator approximation · initialization schemes

The pith

ADANNs design neural network architectures and initializations to mimic classical numerical algorithms for better approximation of parametric PDE operators.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper introduces Algorithmically Designed Artificial Neural Networks (ADANNs) that tailor both the structure of artificial neural networks and their starting values so the networks initially replicate a chosen classical numerical method for approximating operators tied to parametric partial differential equations. The approach merges those classical techniques with deep operator learning so that training begins from an already effective baseline rather than from random weights. Tests on multiple parametric PDEs show the resulting networks reach lower errors than both standard numerical schemes and prior deep learning operator methods. A reader would care if this hybrid starting point allows neural networks to solve families of PDE problems more accurately without needing vastly more data or compute.

Core claim

The paper establishes that customized ANN architectures together with specialized initializations that make the networks closely mimic efficient classical numerical algorithms at the outset produce ADANNs whose approximation quality for operators associated with parametric PDEs exceeds that of existing classical algorithms and existing deep operator learning approaches in the numerical examples considered.

What carries the argument

ADANNs are customized artificial neural network architectures equipped with initialization schemes that make the networks replicate a selected classical numerical algorithm for the target PDE operator approximation problem at the start of training.
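
To make initialization by mimicry concrete, here is a minimal sketch. It is not the authors' construction, and the setting is an assumption chosen for illustration: the 1D heat equation on a periodic grid, an explicit-Euler finite-difference step as the "classical algorithm", and a stack of bias-free linear layers as the "network". All names (`fd_heat_step_matrix`, `LinearStackNet`, `classical_solver`) are hypothetical.

```python
import math

def fd_heat_step_matrix(n, dt, dx, nu=1.0):
    """Matrix of one explicit-Euler finite-difference step for u_t = nu * u_xx (periodic grid)."""
    r = nu * dt / dx**2
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        W[i][i] = 1.0 - 2.0 * r
        W[i][(i - 1) % n] += r
        W[i][(i + 1) % n] += r
    return W

def matvec(W, u):
    return [sum(w * x for w, x in zip(row, u)) for row in W]

class LinearStackNet:
    """Toy network (illustrative assumption): a stack of bias-free linear layers."""
    def __init__(self, weights):
        # Copy the matrices so each layer owns its own (trainable) weights.
        self.weights = [[row[:] for row in W] for W in weights]

    def __call__(self, u):
        for W in self.weights:
            u = matvec(W, u)
        return u

def classical_solver(u0, n_steps, dt, dx, nu=1.0):
    """The same scheme written as a plain time-stepping loop."""
    n, r, u = len(u0), nu * dt / dx**2, list(u0)
    for _ in range(n_steps):
        u = [u[i] + r * (u[i - 1] - 2.0 * u[i] + u[(i + 1) % n]) for i in range(n)]
    return u

n, n_steps = 16, 10
dx = 1.0 / n
dt = 0.4 * dx**2                      # satisfies the explicit-Euler stability bound r <= 1/2
W = fd_heat_step_matrix(n, dt, dx)
net = LinearStackNet([W] * n_steps)   # every layer starts as one solver step
u0 = [math.sin(2.0 * math.pi * i * dx) for i in range(n)]

# Largest pointwise difference between the untrained network and the classical scheme.
gap = max(abs(a - b) for a, b in zip(net(u0), classical_solver(u0, n_steps, dt, dx)))
```

Because each layer starts as one solver step, the untrained network reproduces the classical scheme up to rounding, so `gap` sits at floating-point level. Training would then move the weights away from this baseline.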

If this is right

  • The ADANN construction can be applied to additional parametric PDEs to obtain higher-order operator approximations.
  • Specialized initializations derived from classical algorithms reduce the distance the network must travel during training.
  • The method consistently outperforms both pure classical schemes and prior deep operator learning techniques in the reported tests.
  • The same design principle of algorithmic mimicry at initialization extends to other operator-learning settings for differential equations.
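
The second bullet can be illustrated with a toy comparison, offered as a sketch under assumptions that are not taken from the paper. A single trainable scalar stands in for the network, the target is the exact one-step decay factor `exp(-lam * dt)` of one Fourier mode of the heat equation, the classical start is the explicit-Euler factor `1 - lam * dt`, and `generic_init = 0.0` is an arbitrary stand-in for an uninformed start. The helper `train_scalar` is hypothetical.

```python
import math

def train_scalar(a0, target, lr=0.1, tol=1e-6, max_steps=10_000):
    """Plain gradient descent on the loss (a - target)**2; returns (a, steps used)."""
    a, steps = a0, 0
    while abs(a - target) > tol and steps < max_steps:
        a -= lr * 2.0 * (a - target)   # gradient of the squared error
        steps += 1
    return a, steps

lam = 4.0 * math.pi**2          # eigenvalue of -d^2/dx^2 for the mode sin(2*pi*x)
dt = 0.01
target = math.exp(-lam * dt)    # exact decay factor over one time step
euler_init = 1.0 - lam * dt     # first-order classical approximation of the target
generic_init = 0.0              # uninformed starting value (assumption)

a_mimic, steps_mimic = train_scalar(euler_init, target)
a_generic, steps_generic = train_scalar(generic_init, target)
```

In this toy setting the classical start is already close to the target, so gradient descent needs fewer steps from `euler_init` than from `generic_init`, and the trained value ends up more accurate than the first-order scheme it started from. Whether the same effect holds for full operator networks is an empirical question that the paper's experiments address.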

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same initialization-by-mimicry idea could be tested on time-dependent or nonlinear PDE families not covered in the paper.
  • Automatic procedures that translate a classical solver into an ANN architecture might reduce the manual design effort required.
  • Because the networks begin closer to the solution manifold, they may require smaller training sets than purely data-driven operator learners.
  • The approach suggests a general route for injecting domain knowledge from numerical analysis directly into the starting weights of deep networks.

Load-bearing premise

That adapting ANN architectures and initializations to start by copying classical numerical algorithms will combine effectively with deep operator learning to deliver better accuracy on the parametric PDE problems.

What would settle it

A side-by-side numerical test on one of the parametric PDE examples in which the ADANN method produces equal or higher approximation error than a standard deep operator network or a classical solver after comparable training would falsify the performance advantage.
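
Such a check could be organized as in the sketch below. It assumes a mean relative L2 error over a test set as the metric, which may differ from the error measure used in the paper, and the function names are hypothetical. No numbers from the paper are used.

```python
import math

def relative_l2_error(pred, ref):
    """Relative discrete L2 error between a prediction and a reference solution."""
    num = math.sqrt(sum((p - r) ** 2 for p, r in zip(pred, ref)))
    den = math.sqrt(sum(r ** 2 for r in ref))
    return num / den

def mean_test_error(preds, refs):
    """Average relative L2 error over a list of test samples."""
    return sum(relative_l2_error(p, r) for p, r in zip(preds, refs)) / len(refs)

def advantage_falsified(errors, candidate):
    """True if any other method reaches an error that is equal to or lower than the candidate's."""
    return any(err <= errors[candidate]
               for name, err in errors.items() if name != candidate)
```

A caller would fill a dictionary such as `{"candidate": ..., "baseline_1": ..., "baseline_2": ...}` with `mean_test_error` values measured after comparable training budgets, then pass it to `advantage_falsified`.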

Figures

Figures reproduced from arXiv: 2302.03286 by Adrian Riekert, Arnulf Jentzen, Philippe von Wurstemberger.

Figure 1: Graphical illustration for the base model defined in (…
Figure 2: Graphical illustration of the performance of the methods in Table…
Figure 3: Illustration of the ADANN methodology without difference model (cf. Algorithm 2) with a grid-based black box optimizer applied to the approximation of the operator in (42) based on the Sine-Gordon-type equation in (41) in the case d = 1. Left: Test errors of the base models prior to training as a function of the parameters used for initialization. Right: Test errors of the trained base models as a function…
Figure 4: Illustration of the ADANN methodology without difference model (cf. Algorithm 2) with our heuristic exploration-exploitation black box optimizer applied to the approximation of the operator in (42) mapping initial values to terminal values of the Sine-Gordon-type equation in (41) in the case d = 1. Left: Test errors of trained base models as a function of the parameters used for initialization. Increasing…
Figure 5: Example approximation plots for a randomly chosen initial value for the Sine-Gordon-type…
Figure 6: Graphical illustration of the performance of the methods in Table…
Figure 7: Illustration of the ADANN methodology without difference model (cf. Algorithm 2) with our heuristic exploration-exploitation black box optimizer applied to the approximation of the operator in (42) mapping initial values to terminal values of the Sine-Gordon-type equation in (41) in the case d = 2. Left: Test errors of trained base models as a function of parameters used for initialization. Increasing scat…
Figure 8: Example approximation plots for a randomly chosen initial value for the Sine-Gordon-type…
Figure 9: Graphical illustration of the performance of the methods in Table…
Figure 10: Illustration of the ADANN methodology with and without difference model (cf. Algorithms 1 and 2) applied to the approximation of the operator in (54) mapping initial values to terminal values of the viscous Burgers equation in (53). Left: Test errors of the base models prior to training as a function of the parameters used for initialization. Middle: Test errors of the trained base models as a function of…
Figure 11: Example approximation plots for a randomly chosen initial value for the viscous Burgers…
Figure 12: Graphical illustration for the base model defined in (…
Figure 13: Graphical illustration of the performance of the methods in Table…
Figure 14: Illustration of the ADANN methodology with and without difference model (cf. Algorithms 1 and 2) applied to the approximation of the operator in (68) mapping source terms to terminal values of the reaction diffusion equation in (66). Left: Test errors of the base models prior to training as a function of the parameters used for initialization. Middle: Test errors of the trained base models as a function o…
Figure 15: Example approximation plots for a randomly chosen initial value for the reaction diffusion…
Figure 16: Optimal learning rates for the base model introduced in Section…
Original abstract

In this article we propose a new deep learning approach to approximate operators related to parametric partial differential equations (PDEs). In particular, we introduce a new strategy to design specific artificial neural network (ANN) architectures in conjunction with specific ANN initialization schemes which are tailor-made for the particular approximation problem under consideration. In the proposed approach we combine efficient classical numerical approximation techniques with deep operator learning methodologies. Specifically, we introduce customized adaptions of existing ANN architectures together with specialized initializations for these ANN architectures so that at initialization we have that the ANNs closely mimic a chosen efficient classical numerical algorithm for the considered approximation problem. The obtained ANN architectures and their initialization schemes are thus strongly inspired by numerical algorithms as well as by popular deep learning methodologies from the literature and in that sense we refer to the introduced ANNs in conjunction with their tailor-made initialization schemes as Algorithmically Designed Artificial Neural Networks (ADANNs). We numerically test the proposed ADANN methodology in the case of several parametric PDEs. In the tested numerical examples the ADANN methodology significantly outperforms existing classical approximation algorithms as well as existing deep operator learning methodologies from the literature.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 2 minor

Summary. The manuscript proposes Algorithmically Designed Artificial Neural Networks (ADANNs) for approximating operators associated with parametric PDEs. The core idea is to construct customized ANN architectures and initialization schemes that are strongly inspired by efficient classical numerical algorithms, so that the network at initialization closely mimics a chosen classical solver, and then to combine this with deep operator learning techniques. Numerical tests on several parametric PDE examples are reported to show that the resulting ADANNs significantly outperform both classical approximation algorithms and prior deep operator learning methods from the literature.

Significance. If the reported empirical gains hold under scrutiny, the work offers a concrete mechanism for injecting classical numerical knowledge into the initialization and architecture of operator networks. This could improve training stability and final accuracy for parametric PDE problems where good classical discretizations already exist. The explicit construction of initializations that reproduce classical schemes is a methodological strength that distinguishes the approach from generic operator-learning architectures.

minor comments (2)
  1. The abstract states that ADANNs 'significantly outperform' both classical and existing deep-learning methods, yet provides no quantitative error tables, baseline descriptions, or statistical details. The full experimental section should include these to allow readers to assess the magnitude and robustness of the claimed gains.
  2. Notation for the operator-learning setting (input/output function spaces, parameter domain, etc.) should be introduced once in a dedicated preliminary section rather than scattered across the methodology and examples.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the careful reading and positive assessment of our manuscript, including the accurate summary of the ADANN approach and its empirical performance. We appreciate the recommendation for minor revision and the recognition of the methodological contribution in embedding classical numerical schemes into network architectures and initializations.

Circularity Check

0 steps flagged

No circularity: empirical proposal with independent numerical validation

full rationale

The paper proposes a methodological combination of classical numerical algorithms with deep operator networks via custom ANN architectures and initializations that mimic the classical solver at start. Central claims rest on reported numerical experiments comparing ADANN performance against both classical methods and prior deep operator approaches on parametric PDE test cases. No derivation chain, equation, or claim reduces by construction to a fitted input, self-citation, or self-definition; the approach is explicitly described as an engineering synthesis whose value is assessed externally via benchmarks. This matches the default expectation of a non-circular empirical methods paper.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

Based solely on the abstract, the central claim rests on the existence of efficient classical numerical algorithms that can be mimicked by ANNs and on the assumption that such mimicry plus training yields superior results. The abstract details no specific free parameters, and the only newly introduced entity is the ADANN concept itself.

axioms (1)
  • Domain assumption: existence of efficient classical numerical algorithms for the parametric PDE operator approximation problems under consideration.
    The ADANN construction explicitly relies on choosing and mimicking such algorithms as the basis for ANN design and initialization.
invented entities (1)
  • ADANNs (no independent evidence)
    Purpose: neural network architectures and initializations designed to mimic classical numerical algorithms for parametric PDEs.
    Newly introduced term and concept in the paper; no independent evidence is provided in the abstract.

pith-pipeline@v0.9.0 · 5745 in / 1280 out tokens · 26045 ms · 2026-05-24T09:34:59.427355+00:00 · methodology

