Algorithmically Designed Artificial Neural Networks (ADANNs): Higher order deep operator learning for parametric partial differential equations

Adrian Riekert; Arnulf Jentzen; Philippe von Wurstemberger

arXiv: 2302.03286 · v3 · submitted 2023-02-07 · 🧮 math.NA · cs.NA · stat.ML

Pith reviewed 2026-05-24 09:34 UTC · model grok-4.3

keywords: ADANNs · deep operator learning · parametric partial differential equations · artificial neural networks · numerical approximation · operator approximation · initialization schemes

The pith

ADANNs design neural network architectures and initializations to mimic classical numerical algorithms for better approximation of parametric PDE operators.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper introduces Algorithmically Designed Artificial Neural Networks (ADANNs) that tailor both the structure of artificial neural networks and their starting values so the networks initially replicate a chosen classical numerical method for approximating operators tied to parametric partial differential equations. The approach merges those classical techniques with deep operator learning so that training begins from an already effective baseline rather than from random weights. Tests on multiple parametric PDEs show the resulting networks reach lower errors than both standard numerical schemes and prior deep learning operator methods. A reader would care if this hybrid starting point allows neural networks to solve families of PDE problems more accurately without needing vastly more data or compute.

Core claim

The paper establishes that customized ANN architectures together with specialized initializations that make the networks closely mimic efficient classical numerical algorithms at the outset produce ADANNs whose approximation quality for operators associated with parametric PDEs exceeds that of existing classical algorithms and existing deep operator learning approaches in the numerical examples considered.

What carries the argument

ADANNs are customized artificial neural network architectures equipped with initialization schemes that make the networks replicate a selected classical numerical algorithm for the target PDE operator approximation problem at the start of training.
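
To make the idea of mimicry at initialization concrete, here is a minimal sketch under stated assumptions. It is not the paper's construction. The classical method is assumed to be an explicit Euler finite-difference scheme for a 1D periodic heat equation, and the "network" is a toy stack of linear layers without activations. The names `heat_step_matrix`, `classical_solver`, and `LinearStackNet`, as well as the grid and step sizes, are hypothetical choices made for this example.

```python
import math


def heat_step_matrix(n, nu, dt):
    """Matrix of one explicit Euler finite-difference step for u_t = nu * u_xx
    on a periodic grid with n points on [0, 1). (Assumed toy problem.)"""
    dx = 1.0 / n
    r = nu * dt / dx ** 2
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        W[i][i] += 1.0 - 2.0 * r
        W[i][(i - 1) % n] += r
        W[i][(i + 1) % n] += r
    return W


def classical_solver(u0, nu, dt, steps):
    """Reference scheme: the same stencil applied directly, step by step."""
    u = list(u0)
    n = len(u)
    r = nu * dt * n * n
    for _ in range(steps):
        u = [u[i] + r * (u[(i - 1) % n] - 2.0 * u[i] + u[(i + 1) % n])
             for i in range(n)]
    return u


class LinearStackNet:
    """Toy network: a stack of linear layers whose weights could later be trained."""

    def __init__(self, weights):
        # Copy the matrices so each layer owns its own (trainable) parameters.
        self.weights = [[row[:] for row in W] for W in weights]

    def __call__(self, u0):
        u = list(u0)
        for W in self.weights:
            u = [sum(w * x for w, x in zip(row, u)) for row in W]
        return u


n, nu, dt, steps = 16, 0.01, 0.01, 5
u0 = [math.sin(2.0 * math.pi * i / n) for i in range(n)]

# Initialize every layer with the classical time-step matrix.
net = LinearStackNet([heat_step_matrix(n, nu, dt) for _ in range(steps)])

# Before any training, the network output should match the classical scheme
# up to floating-point rounding.
max_gap = max(abs(a - b) for a, b in zip(net(u0), classical_solver(u0, nu, dt, steps)))
print(max_gap)
```

In this toy setting, training would then adjust the copied weight matrices starting from the classical scheme instead of from random values. The architectures and initializations in the paper are more elaborate than this linear example.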

If this is right

The ADANN construction can be applied to additional parametric PDEs to obtain higher-order operator approximations.
Specialized initializations derived from classical algorithms reduce the distance the network must travel during training.
The method consistently outperforms both pure classical schemes and prior deep operator learning techniques in the reported tests.
The same design principle of algorithmic mimicry at initialization extends to other operator-learning settings for differential equations.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same initialization-by-mimicry idea could be tested on time-dependent or nonlinear PDE families not covered in the paper.
Automatic procedures that translate a classical solver into an ANN architecture might reduce the manual design effort required.
Because the networks begin closer to the solution manifold, they may require smaller training sets than purely data-driven operator learners.
The approach suggests a general route for injecting domain knowledge from numerical analysis directly into the starting weights of deep networks.

Load-bearing premise

That adapting ANN architectures and initializations to start by copying classical numerical algorithms will combine effectively with deep operator learning to deliver better accuracy on the parametric PDE problems.

What would settle it

A side-by-side numerical test on one of the parametric PDE examples in which the ADANN method produces equal or higher approximation error than a standard deep operator network or a classical solver after comparable training would falsify the performance advantage.
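
The comparison described above can be sketched as a small evaluation harness. This is an illustration under assumptions: the relative L2 error is an assumed choice of metric (the page does not state which norm the paper uses), and the function names and the toy operators at the bottom are hypothetical placeholders, not results from the paper.

```python
import math


def relative_l2_error(approx, reference):
    """Relative discrete L2 error between two vectors of grid values."""
    num = math.sqrt(sum((a - r) ** 2 for a, r in zip(approx, reference)))
    den = math.sqrt(sum(r ** 2 for r in reference))
    return num / den


def mean_test_error(method, test_inputs, reference_outputs):
    """Average relative error of a method over a shared test set."""
    errors = [relative_l2_error(method(x), y)
              for x, y in zip(test_inputs, reference_outputs)]
    return sum(errors) / len(errors)


def advantage_is_falsified(candidate_error, baseline_errors):
    """True if the candidate is no better than at least one baseline."""
    return any(candidate_error >= e for e in baseline_errors.values())


# Synthetic placeholders only: the "true" operator doubles its input, and the
# two methods are deliberately perturbed versions of it.
test_inputs = [[1.0, 2.0, 3.0], [0.5, -1.0, 4.0]]
reference_outputs = [[2.0 * v for v in x] for x in test_inputs]


def candidate(x):
    return [2.01 * v for v in x]


def baseline(x):
    return [2.1 * v for v in x]


candidate_error = mean_test_error(candidate, test_inputs, reference_outputs)
baseline_errors = {"baseline": mean_test_error(baseline, test_inputs, reference_outputs)}
print(candidate_error, baseline_errors, advantage_is_falsified(candidate_error, baseline_errors))
```

For such a check to be meaningful, all methods would need to be evaluated on the same test inputs against the same reference solutions, with comparable training and compute budgets.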

Figures

Figures reproduced from arXiv: 2302.03286 by Adrian Riekert, Arnulf Jentzen, Philippe von Wurstemberger.

**Figure 2.** Graphical illustration of the performance of the methods in Table …

**Figure 3.** Illustration of the ADANN methodology without difference model (cf. Algorithm 2) with a grid-based black box optimizer applied to the approximation of the operator in (42) based on the Sine-Gordon-type equation in (41) in the case d = 1. Left: Test errors of the base models prior to training as a function of the parameters used for initialization. Right: Test errors of the trained base models as a function…

**Figure 4.** Illustration of the ADANN methodology without difference model (cf. Algorithm 2) with our heuristic exploration-exploitation black box optimizer applied to the approximation of the operator in (42) mapping initial values to terminal values of the Sine-Gordon-type equation in (41) in the case d = 1. Left: Test errors of trained base models as a function of the parameters used for initialization. Increasing …

**Figure 5.** Example approximation plots for a randomly chosen initial value for the Sine-Gordon-type …

**Figure 6.** Graphical illustration of the performance of the methods in Table …

**Figure 7.** Illustration of the ADANN methodology without difference model (cf. Algorithm 2) with our heuristic exploration-exploitation black box optimizer applied to the approximation of the operator in (42) mapping initial values to terminal values of the Sine-Gordon-type equation in (41) in the case d = 2. Left: Test errors of trained base models as a function of parameters used for initialization. Increasing scat…

**Figure 8.** Example approximation plots for a randomly chosen initial value for the Sine-Gordon-type …

**Figure 9.** Graphical illustration of the performance of the methods in Table …

**Figure 10.** Illustration of the ADANN methodology with and without difference model (cf. Algorithms 1 and 2) applied to the approximation of the operator in (54) mapping initial values to terminal values of the viscous Burgers equation in (53). Left: Test errors of the base models prior to training as a function of the parameters used for initialization. Middle: Test errors of the trained base models as a function of…

**Figure 11.** Example approximation plots for a randomly chosen initial value for the viscous Burgers …

**Figure 12.** Graphical illustration for the base model defined in (…

**Figure 13.** Graphical illustration of the performance of the methods in Table …

**Figure 14.** Illustration of the ADANN methodology with and without difference model (cf. Algorithms 1 and 2) applied to the approximation of the operator in (68) mapping source terms to terminal values of the reaction diffusion equation in (66). Left: Test errors of the base models prior to training as a function of the parameters used for initialization. Middle: Test errors of the trained base models as a function o…

**Figure 15.** Example approximation plots for a randomly chosen initial value for the reaction diffusion …

**Figure 16.** Optimal learning rates for the base model introduced in Section …

Original abstract

In this article we propose a new deep learning approach to approximate operators related to parametric partial differential equations (PDEs). In particular, we introduce a new strategy to design specific artificial neural network (ANN) architectures in conjunction with specific ANN initialization schemes which are tailor-made for the particular approximation problem under consideration. In the proposed approach we combine efficient classical numerical approximation techniques with deep operator learning methodologies. Specifically, we introduce customized adaptions of existing ANN architectures together with specialized initializations for these ANN architectures so that at initialization we have that the ANNs closely mimic a chosen efficient classical numerical algorithm for the considered approximation problem. The obtained ANN architectures and their initialization schemes are thus strongly inspired by numerical algorithms as well as by popular deep learning methodologies from the literature and in that sense we refer to the introduced ANNs in conjunction with their tailor-made initialization schemes as Algorithmically Designed Artificial Neural Networks (ADANNs). We numerically test the proposed ADANN methodology in the case of several parametric PDEs. In the tested numerical examples the ADANN methodology significantly outperforms existing classical approximation algorithms as well as existing deep operator learning methodologies from the literature.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper gives a recipe for initializing neural operators so they start by copying a chosen classical numerical scheme for parametric PDEs, then trains them, with reported gains over both pure classical and standard deep operator methods.

The main takeaway is that they have a practical way to make the starting point of the network match a classical solver for the operator approximation task. They adapt existing ANN architectures and set the initial parameters so that at the beginning the network already reproduces the output of the chosen numerical algorithm for the parametric PDE. Training then refines it from there. This is the core of what they call ADANNs.

It is a direct attempt to merge the structure from numerical analysis with the flexibility of deep operator learning. The paper shows this on several parametric PDE examples and states that the resulting networks beat both the original classical methods and prior deep operator approaches in those tests. That combination of architecture choice plus initialization is the concrete new piece; most prior work either uses off-the-shelf nets or adds physics terms to the loss, but here the mimicry is built into the weights and structure from the start. The approach makes sense when a reliable classical scheme already exists, because you get a good warm start without having to discover it through training alone.

The numerical evidence is the part that needs scrutiny. The abstract claims clear outperformance, but the strength of that claim rests on how the baselines were implemented, what error norms were used, and whether the classical methods were run at comparable cost or resolution. If the tests are clean and the gains hold under those controls, the idea is useful for people who already have a working classical code and want to improve it with limited extra data or compute. If the gains shrink once the classical baselines are tuned equally hard, the advantage is smaller. The method also inherits the limitation that you need a classical algorithm worth mimicking in the first place, so it is not a general replacement for cases where no efficient classical method is known.

Readers working on operator learning or hybrid scientific computing will get the most out of it; the idea is straightforward enough that a referee can check the implementation details and the fairness of the comparisons without needing months of background. It is worth sending out for review rather than desk rejecting.

Referee Report

0 major / 2 minor

Summary. The manuscript proposes Algorithmically Designed Artificial Neural Networks (ADANNs) for approximating operators associated with parametric PDEs. The core idea is to construct customized ANN architectures and initialization schemes that are strongly inspired by efficient classical numerical algorithms, so that the network at initialization closely mimics a chosen classical solver, and then to combine this with deep operator learning techniques. Numerical tests on several parametric PDE examples are reported to show that the resulting ADANNs significantly outperform both classical approximation algorithms and prior deep operator learning methods from the literature.

Significance. If the reported empirical gains hold under scrutiny, the work offers a concrete mechanism for injecting classical numerical knowledge into the initialization and architecture of operator networks. This could improve training stability and final accuracy for parametric PDE problems where good classical discretizations already exist. The explicit construction of initializations that reproduce classical schemes is a methodological strength that distinguishes the approach from generic operator-learning architectures.

minor comments (2)

The abstract states that ADANNs 'significantly outperform' both classical and existing deep-learning methods, yet provides no quantitative error tables, baseline descriptions, or statistical details. The full experimental section should include these to allow readers to assess the magnitude and robustness of the claimed gains.
Notation for the operator-learning setting (input/output function spaces, parameter domain, etc.) should be introduced once in a dedicated preliminary section rather than scattered across the methodology and examples.
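
One possible way to fix such notation in a single place is sketched below. It is a hypothetical illustration, not the paper's notation: the figure captions describe operators that map initial values to terminal values, and the symbols $\mathcal{I}$, $\mathcal{O}$, $\mathcal{S}$, $\mathcal{N}_\theta$, and $U_0$ are placeholders chosen for this sketch.

```latex
% Hypothetical notation; all symbols are placeholders, not taken from the paper.
\begin{align*}
  &\mathcal{I},\ \mathcal{O}
    && \text{input and output function spaces},\\
  &\mathcal{S}\colon \mathcal{I}\to\mathcal{O},\quad \mathcal{S}(u_0)=u(T,\cdot)
    && \text{target operator (initial value to terminal value)},\\
  &\mathcal{N}_\theta\colon \mathcal{I}\to\mathcal{O},\quad \theta\in\mathbb{R}^{p}
    && \text{trainable approximation with parameters } \theta,\\
  &\min_{\theta}\ \mathbb{E}\bigl[\lVert \mathcal{N}_\theta(U_0)-\mathcal{S}(U_0)\rVert_{\mathcal{O}}^{2}\bigr]
    && \text{training objective for a random input } U_0 .
\end{align*}
```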

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the careful reading and positive assessment of our manuscript, including the accurate summary of the ADANN approach and its empirical performance. We appreciate the recommendation for minor revision and the recognition of the methodological contribution in embedding classical numerical schemes into network architectures and initializations.

Circularity Check

0 steps flagged

No circularity: empirical proposal with independent numerical validation

The paper proposes a methodological combination of classical numerical algorithms with deep operator networks via custom ANN architectures and initializations that mimic the classical solver at start. Central claims rest on reported numerical experiments comparing ADANN performance against both classical methods and prior deep operator approaches on parametric PDE test cases. No derivation chain, equation, or claim reduces by construction to a fitted input, self-citation, or self-definition; the approach is explicitly described as an engineering synthesis whose value is assessed externally via benchmarks. This matches the default expectation of a non-circular empirical methods paper.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

Based solely on the abstract, the central claim rests on the existence of efficient classical numerical algorithms that can be mimicked by ANNs and on the assumption that such mimicry plus training yields superior results; no specific free parameters or invented entities are detailed.

axioms (1)

domain assumption: Existence of efficient classical numerical algorithms for the parametric PDE operator approximation problems under consideration
The ADANN construction explicitly relies on choosing and mimicking such algorithms as the basis for ANN design and initialization.

invented entities (1)

ADANNs (no independent evidence)
purpose: Neural network architectures and initializations designed to mimic classical numerical algorithms for parametric PDEs
Newly introduced term and concept in the paper; no independent evidence provided in abstract.

Reference graph

Works this paper leans on

95 extracted references · 95 canonical work pages · 5 internal anchors

[1]

Anastassi, A. A. Constructing Runge–Kutta methods with the use of artificial neural networks. Neural Computing and Applications 25 , 1 (2014), 229–236

work page 2014
[2]

Practical Optimization: Algorithms and Engineering Applications

Antoniou, A., and Lu, W.-S. Practical Optimization: Algorithms and Engineering Applications . Springer US, 2021. 33

work page 2021
[3]

Bar-Sinai, Y., Hoyer, S., Hickey, J., and Brenner, M. P. Learning data-driven discretiza- tions for partial differential equations. Proc. Natl. Acad. Sci. USA 116 , 31 (2019), 15344–15349

work page 2019
[4]

Numerical Methods for Nonlinear Partial Differential Equations

Bartels, S. Numerical Methods for Nonlinear Partial Differential Equations . Springer Interna- tional Publishing, 2015

work page 2015
[5]

An overview on deep learning- based approximation methods for partial differential equations

Beck, C., Hutzenthaler, M., Jentzen, A., and Kuckuck, B. An overview on deep learning- based approximation methods for partial differential equations. Discrete Contin. Dyn. Syst. Ser. B (2022) (2020)

work page 2022
[6]

S., and von Wurstemberger, P

Becker, S., Jentzen, A., M ¨uller, M. S., and von Wurstemberger, P. Learning the random variables in Monte Carlo simulations with stochastic gradient descent: machine learning for parametric PDEs and financial derivative pricing. Math. Finance 34 , 1 (2024), 90–150

work page 2024
[7]

Dynamic programming

Bellman, R. Dynamic programming. Science 153, 3731 (1966), 34–37

work page 1966
[8]

Blechschmidt, J., and Ernst, O. G. Three Ways to Solve Partial Differential Equations with Neural Networks–A Review. arXiv:2102.11802 (2021)

work page arXiv 2021
[9]

Recurrent neural networks as optimal mesh refinement strategies

Bohn, J., and Feischl, M. Recurrent neural networks as optimal mesh refinement strategies. Computers & Mathematics with Applications 97 (2021), 61–76

work page 2021
[10]

Brandstetter, J., Berg, R. v. d., Welling, M., and Gupta, J. K. Clifford neural layers for PDE modeling. arXiv:2209.04934 (2022)

work page arXiv 2022
[11]

Brevis, I., Muga, I., and van der Zee, K. G. A machine-learning minimal-residual (ML-MRes) framework for goal-oriented finite element discretizations. Computers & Mathematics with Appli- cations 95 (2021), 186–199. Recent Advances in Least-Squares and Discontinuous Petrov–Galerkin Finite Element Methods

work page 2021
[12]

Brown, T. B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakan- tan, A., Shyam, P., Sastry, G., Askell, A., Agarwal, S., Herbert-Voss, A., Krueger, G., Henighan, T., Child, R., Ramesh, A., Ziegler, D. M., Wu, J., Winter, C., Hesse, C., Chen, M., Sigler, E., Litwin, M., Gray, S., Chess, B., Clark, J., Berner, C., Mc- Candlish, S., Radfo...

work page internal anchor Pith review Pith/arXiv arXiv 2005
[13]

An introduction to semilinear evolution equations , vol

Cazenave, T., and Haraux, A. An introduction to semilinear evolution equations , vol. 13 of Oxford Lecture Series in Mathematics and its Applications. The Clarendon Press, Oxford University Press, New York, 1998. Translated from the 1990 French original by Yvan Martel and revised by the authors

work page 1998
[14]

Deep operator learning lessens the curse of dimensionality for PDEs

Chen, K., Wang, C., and Yang, H. Deep operator learning lessens the curse of dimensionality for PDEs. arXiv:2301.12227 (2023)

work page arXiv 2023
[15]

R., Scheinberg, K., and Vicente, L

Conn, A. R., Scheinberg, K., and Vicente, L. N. Introduction to Derivative-Free Optimiza- tion. Society for Industrial and Applied Mathematics, 2009

work page 2009
[16]

ANN-based modeling of third order runge kutta method

Dehghanpour, M., Rahati, A., and Dehghanian, E. ANN-based modeling of third order runge kutta method. Journal of Advanced Computer Science & Technology 4 , 1 (2015), 180–189

work page 2015
[17]

Recent advances in deep learning for speech research at Microsoft

Deng, L., Li, J., Huang, J.-T., Yao, K., Yu, D., Seide, F., Seltzer, M., Zweig, G., He, X., Williams, J., Gong, Y., and Acero, A. Recent advances in deep learning for speech research at Microsoft. In 2013 IEEE International Conference on Acoustics, Speech and Signal Processing (2013), pp. 8604–8608

work page 2013
[18]

Numerische Mathematik 2 , revised ed

Deuflhard, P., and Bornemann, F. Numerische Mathematik 2 , revised ed. de Gruyter Lehrbuch. Walter de Gruyter & Co., Berlin, 2008. Gew¨ ohnliche Differentialgleichungen. 34

work page 2008
[19]

S., and Ray, D.Controlling oscillations in high-order Discontin- uous Galerkin schemes using artificial viscosity tuned by neural networks

Discacciati, N., Hesthaven, J. S., and Ray, D.Controlling oscillations in high-order Discontin- uous Galerkin schemes using artificial viscosity tuned by neural networks. Journal of Computational Physics 409 (2020), 109304

work page 2020
[20]

An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale

Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., Dehghani, M., Minderer, M., Heigold, G., Gelly, S., Uszkoreit, J., and Houlsby, N. An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale. In International Conference on Learning Representations (2021)

work page 2021
[21]

A., Brenner, M

Dresdner, G., Kochkov, D., Norgaard, P., Zepeda-N ´u˜nez, L., Smith, J. A., Brenner, M. P., and Hoyer, S. Learning to correct spectral methods for simulating turbulent flows. arXiv:2207.00556 (2022)

work page arXiv 2022
[22]

Deep learning-based numerical methods for high-dimensional parabolic partial differential equations and backward stochastic differential equations

E, W., Han, J., and Jentzen, A. Deep learning-based numerical methods for high-dimensional parabolic partial differential equations and backward stochastic differential equations. Communica- tions in Mathematics and Statistics 5 , 4 (2017), 349–380

work page 2017
[23]

Algorithms for Solving High Dimensional PDEs: From Non- linear Monte Carlo to Machine Learning

E, W., Han, J., and Jentzen, A. Algorithms for Solving High Dimensional PDEs: From Non- linear Monte Carlo to Machine Learning. Nonlinearity 35 (2022) 278-310 (2020)

work page 2022
[24]

E, W., Hutzenthaler, M., Jentzen, A., and Kruse, T. On multilevel Picard numerical approximations for high-dimensional nonlinear parabolic partial differential equations and high- dimensional nonlinear backward stochastic differential equations. J. Sci. Comput. 79 , 3 (2019), 1534–1571

work page 2019
[25]

Partial Differ

E, W., Hutzenthaler, M., Jentzen, A., and Kruse, T.Multilevel Picard iterations for solving smooth semilinear parabolic heat equations. Partial Differ. Equ. Appl. 2 , 6 (2021), 80

work page 2021
[26]

One-parameter semigroups for linear evolution equations , vol

Engel, K.-J., and Nagel, R. One-parameter semigroups for linear evolution equations , vol. 194 of Graduate Texts in Mathematics . Springer-Verlag, New York, 2000. With contributions by S. Brendle, M. Campiti, T. Hahn, G. Metafune, G. Nickel, D. Pallara, C. Perazzoli, A. Rhandi, S. Romanelli and R. Schnaubelt

work page 2000
[27]

J., and Chen, G

Fidkowski, K. J., and Chen, G. Metric-based, goal-oriented mesh adaptation using machine learning. Journal of Computational Physics 426 (2021), 109957

work page 2021
[28]

A Posteriori Learning for Quasi-Geostrophic Turbulence Parametrization.Journal of Advances in Modeling Earth Systems 14, 11 (Nov

Frezat, H., Le Sommer, J., Fablet, R., Balarac, G., and Lguensat, R. A Posteriori Learning for Quasi-Geostrophic Turbulence Parametrization.Journal of Advances in Modeling Earth Systems 14, 11 (Nov. 2022)

work page 2022
[29]

arXiv:2101.08068 (2021)

Germain, M., Pham, H., and Warin, X.Neural networks-based algorithms for stochastic control and PDEs in finance. arXiv:2101.08068 (2021)

work page arXiv 2021
[30]

Learning to optimize multigrid PDE solvers

Greenfeld, D., Galun, M., Basri, R., Yavneh, I., and Kimmel, R. Learning to optimize multigrid PDE solvers. In Proceedings of the 36th International Conference on Machine Learning (09–15 Jun 2019), K. Chaudhuri and R. Salakhutdinov, Eds., vol. 97 of Proceedings of Machine Learning Research, PMLR, pp. 2415–2423

work page 2019
[31]

Proof of the Theory-to-Practice Gap in Deep Learning via Sampling Complexity bounds for Neural Network Approximation Spaces

Grohs, P., and Voigtlaender, F. Proof of the Theory-to-Practice Gap in Deep Learning via Sampling Complexity bounds for Neural Network Approximation Spaces. Foundations of Compu- tational Mathematics (Jul 2023)

work page 2023
[32]

Convolutional Neural Networks for Steady Flow Approxima- tion

Guo, X., Li, W., and Iorio, F. Convolutional Neural Networks for Steady Flow Approxima- tion. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (New York, NY, USA, 2016), KDD ’16, Association for Computing Machinery, p. 481–490

work page 2016
[33]

Solving ordinary differential equations

Hairer, E., and Wanner, G. Solving ordinary differential equations. II , revised ed., vol. 14 of Springer Series in Computational Mathematics . Springer-Verlag, Berlin, 2010. Stiff and differential- algebraic problems. 35

work page 2010
[34]

Solving high-dimensional partial differential equations using deep learning

Han, J., Jentzen, A., and E, W. Solving high-dimensional partial differential equations using deep learning. Proceedings of the National Academy of Sciences 115 , 34 (2018), 8505–8510

work page 2018
[35]

scikit- optimize/scikit-optimize, 2021

Head, T., Kumar, M., Nahrstaedt, H., Louppe, G., and Shcherbatyi, I. scikit- optimize/scikit-optimize, 2021

work page 2021
[36]

The randomized information complexity of elliptic PDE

Heinrich, S. The randomized information complexity of elliptic PDE. Journal of Complexity 22 , 2 (2006), 220–249

work page 2006
[37]

Monte Carlo Complexity of Parametric Integration

Heinrich, S., and Sindambiwe, E. Monte Carlo Complexity of Parametric Integration. Journal of Complexity 15 , 3 (1999), 317–341

work page 1999
[38]

Multilevel CNNs for parametric PDEs

Heiß, C., G¨uhring, I., and Eigel, M. Multilevel CNNs for parametric PDEs. arXiv:2304.00388 (2023)

work page arXiv 2023
[39]

Counterparty Risk Valuation: A Marked Branching Diffusion Approach

Henry-Labordere, P. Counterparty Risk Valuation: A Marked Branching Diffusion Approach. arXiv:1203.2369 (2012)

work page internal anchor Pith review Pith/arXiv arXiv 2012
[40]

Branching diffusion representation of semilinear pdes and monte carlo approximation

Henry-Labordere, P., Oudjane, N., Tan, X., Touzi, N., Warin, X., et al. Branching diffusion representation of semilinear pdes and monte carlo approximation. In Annales de l’Institut Henri Poincar´ e, Probabilit´ es et Statistiques(2019), vol. 55, Institut Henri Poincar´ e, pp. 184–210

work page 2019
[41]

E., Mohamed, A.-r., Jaitly, N., Senior, A., Vanhoucke, V., Nguyen, P., Sainath, T

Hinton, G., Deng, L., Yu, D., Dahl, G. E., Mohamed, A.-r., Jaitly, N., Senior, A., Vanhoucke, V., Nguyen, P., Sainath, T. N., and Kingsbury, B. Deep Neural Networks for Acoustic Modeling in Speech Recognition: The Shared Views of Four Research Groups. IEEE Signal Processing Magazine 29 , 6 (2012), 82–97

work page 2012
[42]

Explicit exponential Runge-Kutta methods for semilinear parabolic problems

Hochbruck, M., and Ostermann, A. Explicit exponential Runge-Kutta methods for semilinear parabolic problems. SIAM J. Numer. Anal. 43 , 3 (2005), 1069–1090

work page 2005
[43]

Learning Neural PDE Solvers with Convergence Guarantees

Hsieh, J.-T., Zhao, S., Eismann, S., Mirabella, L., and Ermon, S. Learning Neural PDE Solvers with Convergence Guarantees. arXiv:1906.01200 (2019)

work page arXiv 1906
[44]

Learning Optimal Multigrid Smoothers via Neural Networks

Huang, R., Li, R., and Xi, Y. Learning Optimal Multigrid Smoothers via Neural Networks. SIAM Journal on Scientific Computing 45 , 3 (2023), S199–S225

work page 2023
[45]

On fast simulation of dynamical system with neural vector enhanced numerical solver

Huang, Z., Liang, S., Zhang, H., Yang, H., and Lin, L. On fast simulation of dynamical system with neural vector enhanced numerical solver. Scientific Reports 13, 1 (Sep 2023), 15254

work page 2023
[46]

A., and von Wurstemberger, P

Hutzenthaler, M., Jentzen, A., Kruse, T., Nguyen, T. A., and von Wurstemberger, P. Overcoming the curse of dimensionality in the numerical approximation of semilinear parabolic partial differential equations. Proc. A. 476, 2244 (2020), 20190630, 25

work page 2020
[47]

Mathematical Introduction to Deep Learning: Methods, Implementations, and Theory

Jentzen, A., Kuckuck, B., and von Wurstemberger, P. Mathematical Introduction to Deep Learning: Methods, Implementations, and Theory. arXiv:2310.20360 (2023)

work page arXiv 2023
[48]

S., and S ¨uli, E

Jovanovi´c, B. S., and S ¨uli, E. Analysis of Finite Difference Schemes . Springer London, 2014

work page 2014
[49]

E., Kevrekidis, I

Karniadakis, G. E., Kevrekidis, I. G., Lu, L., Perdikaris, P., Wang, S., and Yang, L. Physics-informed machine learning. Nature Reviews Physics 3 , 6 (2021), 422–440

work page 2021
[50]

Deep Multigrid: learning prolongation and restriction matrices

Katrutsa, A., Daulbaev, T., and Oseledets, I. Deep Multigrid: learning prolongation and restriction matrices. arXiv:1711.03825 (2017)

work page internal anchor Pith review Pith/arXiv arXiv 2017
[51]

Solving parametric PDE problems with artificial neural networks

Khoo, Y., Lu, J., and Ying, L. Solving parametric PDE problems with artificial neural networks. European J. Appl. Math. 32 , 3 (2021), 421–435

work page 2021
[52]

A., Alieva, A., Wang, Q., Brenner, M

Kochkov, D., Smith, J. A., Alieva, A., Wang, Q., Brenner, M. P., and Hoyer, S. Machine learning-accelerated computational fluid dynamics. Proc. Natl. Acad. Sci. USA 118 , 21 (2021), Paper No. e2101784118, 8. 36

work page 2021
[53]

Deep FDM: Enhanced finite difference methods by deep learning

Kossaczk´a, T., Ehrhardt, M., and G ¨unther, M. Deep FDM: Enhanced finite difference methods by deep learning. Franklin Open 4 (2023), 100039

work page 2023
[54]

On universal approximation and error bounds for Fourier neural operators

Kovachki, N., Lanthaler, S., and Mishra, S. On universal approximation and error bounds for Fourier neural operators. J. Mach. Learn. Res. 22 (2021), Paper No. [290], 76

work page 2021
[55]

Krizhevsky, A., Sutskever, I., and Hinton, G. E. ImageNet Classification with Deep Convo- lutional Neural Networks. In Advances in Neural Information Processing Systems (2012), F. Pereira, C. Burges, L. Bottou, and K. Weinberger, Eds., vol. 25, Curran Associates, Inc

work page 2012
[56]

Nonlinear reconstruction for operator learning of PDEs with discontinuities

Lanthaler, S., Molinaro, R., Hadorn, P., and Mishra, S. Nonlinear reconstruction for operator learning of PDEs with discontinuities. arXiv:2210.01074 (2022)

work page arXiv 2022
[57]

LeVeque, R. J. Finite Difference Methods for Ordinary and Partial Differential Equations. Society for Industrial and Applied Mathematics, 2007

[58]

Li, Z., Huang, D. Z., Liu, B., and Anandkumar, A. Fourier neural operator with learned deformations for PDEs on general geometries. arXiv:2207.05209 (2022)

[59]

Li, Z., Kovachki, N., Azizzadenesheli, K., Liu, B., Bhattacharya, K., Stuart, A., and Anandkumar, A. Neural operator: Graph kernel network for partial differential equations. arXiv:2003.03485 (2020)

[60]

Li, Z., Kovachki, N., Azizzadenesheli, K., Liu, B., Bhattacharya, K., Stuart, A., and Anandkumar, A. Fourier neural operator for parametric partial differential equations. In International Conference on Learning Representations (2021)

[61]

List, B., Chen, L.-W., and Thuerey, N. Learned turbulence modelling with differentiable fluid solvers: physics-based loss functions and optimisation horizons. Journal of Fluid Mechanics 949 (2022), A25

[62]

Liu, Y., Kutz, J. N., and Brunton, S. L. Hierarchical deep learning of multiscale differential equation time-steppers. Philos. Trans. Roy. Soc. A 380, 2229 (2022), Paper No. 20210200, 17

[63]

Lu, L., Jin, P., Pang, G., Zhang, Z., and Karniadakis, G. E. Learning nonlinear operators via DeepONet based on the universal approximation theorem of operators. Nat. Mach. Intell. 3, 3 (2021), 218–229

[64]

Lu, L., Meng, X., Cai, S., Mao, Z., Goswami, S., Zhang, Z., and Karniadakis, G. E. A comprehensive and fair comparison of two neural operators (with practical extensions) based on fair data. Computer Methods in Applied Mechanics and Engineering 393 (2022), 114778

[65]

Maulik, R., San, O., Rasheed, A., and Vedula, P. Subgrid modelling for two-dimensional turbulence using neural networks. Journal of Fluid Mechanics 858 (2019), 122–144

[66]

Mehrish, A., Majumder, N., Bhardwaj, R., Mihalcea, R., and Poria, S. A review of deep learning techniques for speech processing. arXiv:2305.00359 (2023)

[67]

Mishra, S. A machine learning framework for data driven acceleration of computations of differential equations. Math. Eng. 1, 1 (2019), 118–146

[68]

Nelsen, N. H., and Stuart, A. M. The random feature model for input-output maps between Banach spaces. SIAM J. Sci. Comput. 43, 5 (2021), A3212–A3243

[69]

Nguwi, J. Y., Penent, G., and Privault, N. A fully nonlinear Feynman-Kac formula with derivatives of arbitrary orders. arXiv:2201.03882 (2022)

[70]

Novak, E., and Woźniakowski, H. Tractability of Multivariate Problems: Standard information for functionals, vol. 12. European Mathematical Society, 2008

[71]

Novak, E., and Woźniakowski, H. Tractability of multivariate problems. Vol. 1: Linear information, vol. 6 of EMS Tracts in Mathematics. European Mathematical Society (EMS), Zürich, 2008

[72]

OpenAI. GPT-4 Technical Report. arXiv:2303.08774 (2023)

[73]

Ouala, S., Debreu, L., Pascual, A., Chapron, B., Collard, F., Gaultier, L., and Fablet, R. Learning Runge-Kutta integration schemes for ODE simulation and identification. arXiv:2105.04999 (2021)

[74]

Pazy, A. Semigroups of linear operators and applications to partial differential equations, vol. 44 of Applied Mathematical Sciences. Springer-Verlag, New York, 1983

[75]

Pham, H., and Warin, X. Mean-field neural networks: learning mappings on Wasserstein space. arXiv:2210.15179 (2022)

[76]

Qiu, X., Sun, T., Xu, Y., Shao, Y., Dai, N., and Huang, X. Pre-trained models for natural language processing: A survey. Science China Technological Sciences 63, 10 (Sept. 2020), 1872–1897

[77]

Raissi, M., Perdikaris, P., and Karniadakis, G. E. Physics-informed neural networks: a deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. J. Comput. Phys. 378 (2019), 686–707

[78]

Raonić, B., Molinaro, R., Rohner, T., Mishra, S., and de Bezenac, E. Convolutional Neural Operators. arXiv:2302.01178 (2023)

[79]

Ray, D., and Hesthaven, J. S. An artificial neural network as a troubled-cell indicator. Journal of Computational Physics 367 (2018), 166–191

[80]

Rogers, A., Kovaleva, O., and Rumshisky, A. A primer in BERTology: What we know about how BERT works. Transactions of the Association for Computational Linguistics 8 (2020), 842–866


Showing first 80 references.
