Algorithmically Designed Artificial Neural Networks (ADANNs): Higher order deep operator learning for parametric partial differential equations
Pith reviewed 2026-05-24 09:34 UTC · model grok-4.3
The pith
ADANNs are neural networks whose architectures and initializations are designed to mimic classical numerical algorithms, with the aim of approximating parametric PDE operators more accurately.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The paper's central claim is that customized ANN architectures, paired with specialized initializations that make the networks closely mimic efficient classical numerical algorithms at the start of training, yield ADANNs that approximate operators associated with parametric PDEs more accurately than both existing classical algorithms and existing deep operator learning approaches in the numerical examples considered.
What carries the argument
ADANNs are customized artificial neural network architectures equipped with initialization schemes that make the networks replicate a selected classical numerical algorithm for the target PDE operator approximation problem at the start of training.
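The idea of starting a network at a classical scheme can be illustrated with a minimal sketch. This is not the paper's construction: it assumes a 1D periodic heat equation, a second-order finite-difference Laplacian, implicit Euler time stepping, and a single bias-free linear layer. The names `laplacian_periodic`, `classical_init_weights`, and `LinearOperatorNet`, as well as all parameter values, are hypothetical and chosen only for illustration.

```python
import numpy as np


def laplacian_periodic(n: int, length: float = 1.0) -> np.ndarray:
    """Second-order finite-difference Laplacian on a periodic grid with n points."""
    h = length / n
    lap = -2.0 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)
    lap[0, -1] = 1.0  # periodic wrap-around entries
    lap[-1, 0] = 1.0
    return lap / h**2


def classical_init_weights(n: int, diffusivity: float,
                           final_time: float, n_steps: int) -> np.ndarray:
    """Weight matrix equal to n_steps implicit Euler steps for u_t = diffusivity * u_xx."""
    dt = final_time / n_steps
    one_step = np.linalg.inv(np.eye(n) - dt * diffusivity * laplacian_periodic(n))
    return np.linalg.matrix_power(one_step, n_steps)


class LinearOperatorNet:
    """Bias-free linear layer mapping the initial value u(0) to an approximation of u(T)."""

    def __init__(self, weights: np.ndarray):
        # The weights start at the classical scheme and could be trained afterwards.
        self.weights = weights.copy()

    def __call__(self, u0: np.ndarray) -> np.ndarray:
        return self.weights @ u0


if __name__ == "__main__":
    # Hypothetical illustration values (assumptions, not taken from the paper).
    n, diffusivity, final_time, n_steps = 64, 0.01, 1.0, 100
    x = np.arange(n) / n
    u0 = np.sin(2 * np.pi * x)
    net = LinearOperatorNet(classical_init_weights(n, diffusivity, final_time, n_steps))
    exact = np.exp(-4 * np.pi**2 * diffusivity * final_time) * u0
    rel_err = np.linalg.norm(net(u0) - exact) / np.linalg.norm(exact)
    print(f"relative L2 error at initialization: {rel_err:.2e}")
```

In this sketch the untrained network reproduces the classical solver exactly, so its error before any training equals the discretization error of the scheme. Training would then only need to improve on that starting point rather than learn the operator from a random initialization.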
If this is right
- The ADANN construction can be applied to additional parametric PDEs to obtain higher-order operator approximations.
- Specialized initializations derived from classical algorithms reduce the distance the network must travel during training.
- The method consistently outperforms both pure classical schemes and prior deep operator learning techniques in the reported tests.
- The same design principle of algorithmic mimicry at initialization extends to other operator-learning settings for differential equations.
Where Pith is reading between the lines
- The same initialization-by-mimicry idea could be tested on time-dependent or nonlinear PDE families not covered in the paper.
- Automatic procedures that translate a classical solver into an ANN architecture might reduce the manual design effort required.
- Because the networks begin closer to the solution manifold, they may require smaller training sets than purely data-driven operator learners.
- The approach suggests a general route for injecting domain knowledge from numerical analysis directly into the starting weights of deep networks.
Load-bearing premise
That adapting ANN architectures and initializations to start by copying classical numerical algorithms will combine effectively with deep operator learning to deliver better accuracy on the parametric PDE problems.
What would settle it
A side-by-side numerical test on one of the parametric PDE examples in which the ADANN method produces equal or higher approximation error than a standard deep operator network or a classical solver after comparable training would falsify the performance advantage.
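The comparison described above could be organized along the lines of the following sketch. It assumes the mean relative L2 error on a shared test set as the metric, which the source does not specify. The function names and method labels are hypothetical, and the demo data are synthetic placeholders, not results from the paper.

```python
import numpy as np


def relative_l2_error(approx: np.ndarray, reference: np.ndarray) -> float:
    """Mean relative L2 error over test samples (one sample per row)."""
    num = np.linalg.norm(approx - reference, axis=1)
    den = np.linalg.norm(reference, axis=1)
    return float(np.mean(num / den))


def advantage_falsified(errors: dict, candidate: str = "adann") -> bool:
    """True if any competing method matches or beats the candidate's error."""
    return any(err <= errors[candidate]
               for name, err in errors.items() if name != candidate)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    reference = rng.standard_normal((8, 32))  # stand-in for reference solutions
    # Synthetic predictions with arbitrary noise levels (assumptions only;
    # they say nothing about how the real methods perform).
    noise_levels = {"adann": 1e-3, "deep_operator_net": 1e-2, "classical_solver": 5e-3}
    errors = {
        name: relative_l2_error(
            reference + level * rng.standard_normal(reference.shape), reference)
        for name, level in noise_levels.items()
    }
    print(errors)
    print("advantage falsified:", advantage_falsified(errors))
```

For the test to be meaningful, all methods would need to be evaluated on the same test inputs and against the same reference solutions, with comparable training or compute budgets.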
Original abstract
In this article we propose a new deep learning approach to approximate operators related to parametric partial differential equations (PDEs). In particular, we introduce a new strategy to design specific artificial neural network (ANN) architectures in conjunction with specific ANN initialization schemes which are tailor-made for the particular approximation problem under consideration. In the proposed approach we combine efficient classical numerical approximation techniques with deep operator learning methodologies. Specifically, we introduce customized adaptions of existing ANN architectures together with specialized initializations for these ANN architectures so that at initialization we have that the ANNs closely mimic a chosen efficient classical numerical algorithm for the considered approximation problem. The obtained ANN architectures and their initialization schemes are thus strongly inspired by numerical algorithms as well as by popular deep learning methodologies from the literature and in that sense we refer to the introduced ANNs in conjunction with their tailor-made initialization schemes as Algorithmically Designed Artificial Neural Networks (ADANNs). We numerically test the proposed ADANN methodology in the case of several parametric PDEs. In the tested numerical examples the ADANN methodology significantly outperforms existing classical approximation algorithms as well as existing deep operator learning methodologies from the literature.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes Algorithmically Designed Artificial Neural Networks (ADANNs) for approximating operators associated with parametric PDEs. The core idea is to construct customized ANN architectures and initialization schemes that are strongly inspired by efficient classical numerical algorithms, so that the network at initialization closely mimics a chosen classical solver, and then to combine this with deep operator learning techniques. Numerical tests on several parametric PDE examples are reported to show that the resulting ADANNs significantly outperform both classical approximation algorithms and prior deep operator learning methods from the literature.
Significance. If the reported empirical gains hold under scrutiny, the work offers a concrete mechanism for injecting classical numerical knowledge into the initialization and architecture of operator networks. This could improve training stability and final accuracy for parametric PDE problems where good classical discretizations already exist. The explicit construction of initializations that reproduce classical schemes is a methodological strength that distinguishes the approach from generic operator-learning architectures.
minor comments (2)
- The abstract states that ADANNs 'significantly outperform' both classical and existing deep-learning methods, yet provides no quantitative error tables, baseline descriptions, or statistical details. The full experimental section should include these to allow readers to assess the magnitude and robustness of the claimed gains.
- Notation for the operator-learning setting (input/output function spaces, parameter domain, etc.) should be introduced once in a dedicated preliminary section rather than scattered across the methodology and examples.
Simulated Author's Rebuttal
We thank the referee for the careful reading and positive assessment of our manuscript, including the accurate summary of the ADANN approach and its empirical performance. We appreciate the recommendation for minor revision and the recognition of the methodological contribution in embedding classical numerical schemes into network architectures and initializations.
Circularity Check
No circularity: empirical proposal with independent numerical validation
The paper proposes a methodological combination of classical numerical algorithms with deep operator networks via custom ANN architectures and initializations that mimic the classical solver at start. Central claims rest on reported numerical experiments comparing ADANN performance against both classical methods and prior deep operator approaches on parametric PDE test cases. No derivation chain, equation, or claim reduces by construction to a fitted input, self-citation, or self-definition; the approach is explicitly described as an engineering synthesis whose value is assessed externally via benchmarks. This matches the default expectation of a non-circular empirical methods paper.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: existence of efficient classical numerical algorithms for the parametric PDE operator approximation problems under consideration.
invented entities (1)
- ADANNs: no independent evidence