arxiv: 2308.13222 · v3 · submitted 2023-08-25 · ⚛️ physics.comp-ph · cs.LG · physics.flu-dyn · stat.ML

Bayesian Reasoning for Physics Informed Neural Networks

Pith reviewed 2026-05-24 08:02 UTC · model grok-4.3

classification ⚛️ physics.comp-ph · cs.LG · physics.flu-dyn · stat.ML
keywords physics-informed neural networks · Bayesian inference · Laplace approximation · model evidence · partial differential equations · uncertainty quantification · hyperparameter optimization

The pith

A Laplace approximation enables automatic optimization of loss weights in Bayesian physics-informed neural networks by computing model evidence analytically without sampling.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes a Bayesian formulation for physics-informed neural networks in which loss weights for PDE residuals, boundary conditions, and data are selected by maximizing the model evidence. It replaces posterior sampling or variational methods with a Laplace approximation that yields an analytic expression for the evidence, allowing efficient hyperparameter tuning and model comparison. Demonstrations on the heat, wave, and Burgers equations produce solutions consistent with exact or reference results, and the Burgers case shows natural integration of noisy measurements with governing equations inside a single uncertainty-aware framework.

Core claim

We introduce an evidence-driven Bayesian formulation of physics-informed neural networks that enables automatic optimization of loss weights between PDE residuals, boundary conditions, and observational data. Unlike existing Bayesian PINN approaches based on sampling or variational inference, the proposed method uses a Laplace approximation to compute model evidence analytically, enabling efficient hyperparameter tuning and model comparison without posterior sampling. We demonstrate the method on the heat, wave, and Burgers' equations, obtaining solutions in agreement with exact or reference results. In the Burgers' equation example, we further show that the framework naturally integrates information from governing equations and noisy measurements, providing predictive uncertainties within a unified Bayesian setting.

What carries the argument

Laplace approximation to the posterior mode, used to obtain an analytic expression for the marginal likelihood (model evidence) that drives loss-weight selection.
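
As a hedged illustration of what an analytic evidence looks like, the sketch below computes the Laplace log evidence for a linear-in-parameters Gaussian model and compares it with the closed-form marginal likelihood. In this special case the Laplace approximation is exact; for a PINN the posterior is non-Gaussian, so the same formula is only an approximation. The model, the function names, and the roles of `alpha` (prior precision) and `beta` (noise precision) are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def laplace_log_evidence(Phi, y, alpha, beta):
    """Laplace log evidence for y = Phi @ w + noise, prior w ~ N(0, I/alpha)."""
    n, m = Phi.shape
    A = alpha * np.eye(m) + beta * Phi.T @ Phi      # Hessian of the negative log posterior
    w_map = beta * np.linalg.solve(A, Phi.T @ y)     # posterior mode
    resid = y - Phi @ w_map
    s_map = 0.5 * beta * resid @ resid + 0.5 * alpha * w_map @ w_map
    _, logdet_A = np.linalg.slogdet(A)
    return (-s_map - 0.5 * logdet_A
            + 0.5 * m * np.log(alpha)
            + 0.5 * n * np.log(beta / (2.0 * np.pi)))

def exact_log_evidence(Phi, y, alpha, beta):
    """Closed-form log marginal likelihood: y ~ N(0, I/beta + Phi Phi^T / alpha)."""
    n = Phi.shape[0]
    C = np.eye(n) / beta + Phi @ Phi.T / alpha
    _, logdet_C = np.linalg.slogdet(C)
    return (-0.5 * y @ np.linalg.solve(C, y)
            - 0.5 * logdet_C
            - 0.5 * n * np.log(2.0 * np.pi))

# Hypothetical toy data: cubic polynomial features, noisy sine targets.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 20)
Phi = np.vander(x, 4, increasing=True)
y = np.sin(2.0 * np.pi * x) + 0.1 * rng.standard_normal(x.size)

lap = laplace_log_evidence(Phi, y, alpha=1.0, beta=100.0)
exa = exact_log_evidence(Phi, y, alpha=1.0, beta=100.0)
print(lap, exa)
```

The two numbers agree here only because the toy posterior is exactly Gaussian. How closely they would agree for a neural-network posterior is the open question raised under "Load-bearing premise".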

If this is right

  • Loss weights between physics residuals and data terms are chosen automatically rather than by manual search.
  • Different PINN models or physics assumptions can be ranked by their computed evidence values.
  • Predictive uncertainty estimates become available within the same training procedure that fits the network.
  • Noisy measurements are incorporated directly alongside the governing equations without separate regularization schedules.
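
To make the first bullet concrete, here is a minimal sketch of evidence-driven weight selection using MacKay's classical fixed-point re-estimation for a linear model with one prior precision `alpha` and one noise precision `beta`. This is a stand-in technique chosen for illustration: the Figure 4 caption suggests the paper instead updates its hyperparameters by minimizing an error term E_hyp during training, and a PINN would carry separate weights for PDE, boundary, and data terms. All names and data below are hypothetical.

```python
import numpy as np

def evidence_update(Phi, y, alpha=1.0, beta=1.0, n_iter=100):
    """Alternate between the posterior mode and MacKay's alpha/beta re-estimation."""
    n, m = Phi.shape
    eig = np.linalg.eigvalsh(Phi.T @ Phi)            # eigenvalues of the unscaled data Hessian
    for _ in range(n_iter):
        A = alpha * np.eye(m) + beta * Phi.T @ Phi
        w = beta * np.linalg.solve(A, Phi.T @ y)     # posterior mode for current weights
        gamma = np.sum(beta * eig / (alpha + beta * eig))  # effective number of parameters
        alpha = gamma / (w @ w)                      # alpha = gamma / (2 E_W)
        beta = (n - gamma) / np.sum((y - Phi @ w) ** 2)    # beta = (N - gamma) / (2 E_D)
    return alpha, beta, w

# Hypothetical data: quadratic ground truth with noise std 0.1 (true beta = 100).
rng = np.random.default_rng(1)
x = np.linspace(-1.0, 1.0, 200)
Phi = np.column_stack([np.ones_like(x), x, x ** 2])
w_true = np.array([0.5, -1.0, 2.0])
y = Phi @ w_true + 0.1 * rng.standard_normal(x.size)

alpha_hat, beta_hat, w_hat = evidence_update(Phi, y)
print(alpha_hat, beta_hat, w_hat)
```

No manual search over the weights is involved: the data-fit weight `beta_hat` should land near the inverse noise variance, which is the kind of behavior the bullet describes.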

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The analytic evidence may allow rapid iteration over network architectures or PDE formulations that would be expensive to compare with sampling-based methods.
  • If the Laplace approximation remains reliable for deeper networks or higher-dimensional PDEs, the method could extend to inverse problems where both parameters and loss weights must be inferred.
  • The framework supplies a concrete route to compare a physics-informed model against a purely data-driven one on the same evidence scale.

Load-bearing premise

The Laplace approximation around the posterior mode yields a sufficiently accurate estimate of the marginal likelihood for the purpose of loss-weight selection in PINN training.

What would settle it

A side-by-side run on the same PDE problems in which the loss weights chosen by the Laplace evidence produce solutions whose error or uncertainty calibration differs substantially from weights obtained by full MCMC sampling or by cross-validation.
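
A one-dimensional toy version of such a side-by-side check is sketched below. It compares the Laplace estimate of a log normalizing constant with a brute-force quadrature reference for an unnormalized posterior exp(-S(w)), where S(w) = w^2/2 + k w^4. Quadrature stands in for MCMC only because the problem is one-dimensional; the test function and all names are assumptions for illustration, not the paper's setup.

```python
import numpy as np

def log_z_laplace(s, w_mode, h=1e-4):
    """Laplace estimate of log of the integral of exp(-s(w)) around a known mode."""
    # Central finite-difference estimate of the curvature s''(w_mode).
    curv = (s(w_mode + h) - 2.0 * s(w_mode) + s(w_mode - h)) / h ** 2
    return -s(w_mode) + 0.5 * np.log(2.0 * np.pi / curv)

def log_z_quadrature(s, lo=-10.0, hi=10.0, n=200001):
    """Reference value from the trapezoid rule on a fine grid."""
    w = np.linspace(lo, hi, n)
    f = np.exp(-s(w))
    dw = w[1] - w[0]
    return np.log(np.sum(0.5 * (f[1:] + f[:-1])) * dw)

gaps = {}
for k in (0.0, 0.1, 1.0):
    s = lambda w, k=k: 0.5 * w ** 2 + k * w ** 4   # k = 0 is exactly Gaussian
    gaps[k] = log_z_laplace(s, 0.0) - log_z_quadrature(s)
    print(f"k={k}: Laplace minus reference log-evidence = {gaps[k]:.4f}")
```

The gap vanishes in the Gaussian case and grows with the quartic term, which illustrates why the accuracy of the approximation has to be checked rather than assumed when evidence values drive weight selection.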

Figures

Figures reproduced from arXiv: 2308.13222 by Kornel Witkowski, Krzysztof M. Graczyk.

Figure 1: The above MLP contains two hidden unit layers. Empty squa…
Figure 2: Nonlinear regression within this paper approach (left pane…
Figure 3: In the left panel: the training data, the blue/red points co…
Figure 4: An example of the evolution of the loss E_T (first panel from the left), E_hyp (last panel), and the hyperparameters α (second from the right) and β (third from the right) during the training of the network that solves the heat equation. We optimized the hyperparameters after 5000 epochs to maintain the procedure's convergence. Then, every 25 epochs, we updated the α and β parameters to minimize the error E_hyp.…
Figure 5: The exact (u_true) and PINN (u_pred) solutions for the heat equation. The grey area denotes 2σ uncertainty. 4.3 Wave equation. Let us consider the following wave equation
$$\frac{\partial^2 u}{\partial t^2} - \frac{\partial^2 u}{\partial x^2} = 0, \qquad x \in [0, 1], \; t \in [0, 1], \tag{47}$$
$$u(0, t) = u(1, t) = 0, \tag{48}$$
$$\frac{\partial}{\partial t} u(0, t) = 0, \tag{49}$$
with
$$u(x, 0) = \sin(\pi x) + \frac{1}{2} \sin(2\pi x). \tag{50}$$
The analytic solution reads $u_{\mathrm{true}}(x, t) = \sin(\pi x)\cos(\pi t) + \frac{1}{2}\sin(2\pi x)\cos(2\pi t)$. In the previo…
Figure 6: In the left panel: the training data, the blue/red points co…
Figure 7: Parameters evolution during the wave equation training fo…
Figure 8: The exact (u_true) and PINN (u_pred) solutions for the heat equation with boundary conditions (50). The grey area denotes 2σ uncertainty. Similarly, as for the heat equation, we show the histogram of the relative log of evidence values in the right panel of…
Figure 9: Comparison of the obtained uncertainties and real error…
Figure 10: The parameters evolution for the full model. The loss fun…
Figure 11: In the left panel: the training data, the blue/red points c…
Figure 12: The "true" solution (u_true), i.e. the numerical solution from [11], and our best model predictions (u_pred) for Burgers' equation. The grey area denotes 2σ uncertainty. 5 Summary. We adopted and modified the Bayesian framework for the PINN to solve heat, wave, and Burgers' equations. The network solutions agree with the "true" ones. The method allowed us to compute the uncertainty due to variation of network pa…
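
The wave-equation setup quoted in the Figure 5 caption can be sanity-checked numerically. The sketch below evaluates the quoted analytic solution, estimates the PDE residual with central finite differences, and checks the boundary condition (48) and the initial profile (50). The grid, the step size `h`, and the variable names are arbitrary choices for illustration.

```python
import numpy as np

def u_true(x, t):
    """Analytic solution quoted in the Figure 5 caption."""
    return (np.sin(np.pi * x) * np.cos(np.pi * t)
            + 0.5 * np.sin(2.0 * np.pi * x) * np.cos(2.0 * np.pi * t))

h = 1e-3
x, t = np.meshgrid(np.linspace(0.1, 0.9, 9), np.linspace(0.1, 0.9, 9))

# Central second differences in time and space.
u_tt = (u_true(x, t + h) - 2.0 * u_true(x, t) + u_true(x, t - h)) / h ** 2
u_xx = (u_true(x + h, t) - 2.0 * u_true(x, t) + u_true(x - h, t)) / h ** 2
residual = np.max(np.abs(u_tt - u_xx))       # PDE (47): u_tt - u_xx = 0

ts = np.linspace(0.0, 1.0, 11)
xs = np.linspace(0.0, 1.0, 11)
boundary = max(np.max(np.abs(u_true(0.0, ts))), np.max(np.abs(u_true(1.0, ts))))
initial_gap = np.max(np.abs(u_true(xs, 0.0)
                            - (np.sin(np.pi * xs) + 0.5 * np.sin(2.0 * np.pi * xs))))
print(residual, boundary, initial_gap)
```

The same finite-difference residual could serve as a quick reference when inspecting a trained network's output, although a PINN would normally obtain these derivatives by automatic differentiation.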
Original abstract

We introduce an evidence-driven Bayesian formulation of physics-informed neural networks that enables automatic optimization of loss weights between PDE residuals, boundary conditions, and observational data. Unlike existing Bayesian PINN approaches based on sampling or variational inference, the proposed method uses a Laplace approximation to compute model evidence analytically, enabling efficient hyperparameter tuning and model comparison without posterior sampling. We demonstrate the method on the heat, wave, and Burgers' equations, obtaining solutions in agreement with exact or reference results. In the Burgers' equation example, we further show that the framework naturally integrates information from governing equations and noisy measurements, providing predictive uncertainties within a unified Bayesian setting.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper introduces an evidence-driven Bayesian formulation of physics-informed neural networks (PINNs) that uses a Laplace approximation to compute the model evidence analytically. This enables automatic optimization of loss weights balancing PDE residuals, boundary conditions, and observational data, without requiring posterior sampling or variational inference. The approach is demonstrated on the heat, wave, and Burgers' equations, where solutions agree with exact or reference results, and uncertainties are produced in a unified setting that integrates governing equations with noisy measurements.

Significance. If the Laplace approximation proves sufficiently accurate for the non-convex PINN posteriors, the method would offer an efficient alternative to sampling-based Bayesian PINNs for hyperparameter tuning and model comparison. The analytic evidence computation is a clear strength, as is the unified treatment of physics residuals and data in the Burgers' example. However, the absence of quantitative error metrics, ablation studies on the approximation, and comparisons to existing weight-tuning heuristics limits the immediate impact.

major comments (3)
  1. [Method section (Laplace approximation derivation)] The central claim that the Laplace approximation yields a reliable analytic marginal likelihood for loss-weight selection rests on the assumption that the posterior over network weights is locally quadratic and unimodal. No verification of this is provided (e.g., no check that the Hessian at the MAP estimate is positive definite or that negative eigenvalues are absent), which is load-bearing for the automatic optimization procedure.
  2. [Numerical experiments (heat, wave, Burgers' sections)] Results on the three PDEs report only qualitative agreement with exact/reference solutions. No quantitative metrics (L2 errors, relative errors, or convergence rates) are supplied, nor is there an ablation on how the evidence-based weights compare to manual or heuristic tuning.
  3. [Experiments and discussion] No comparison is made to existing Bayesian PINN approaches (sampling or VI) or to standard loss-weight heuristics, leaving open whether the analytic evidence computation improves upon prior methods in practice.
minor comments (2)
  1. [Method] Notation for the loss weights and evidence terms could be clarified with explicit definitions early in the method section to aid readability.
  2. [Figures] Figure captions should include more detail on what is being plotted (e.g., mean prediction vs. uncertainty bands) for the Burgers' example.
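
The Hessian check requested in major comment 1 could look roughly like the sketch below: build the Hessian at the candidate mode, inspect its eigenvalues, and report whether it is positive definite. The finite-difference Hessian and the two toy loss functions are assumptions for illustration; for an actual network one would more likely obtain the Hessian from automatic differentiation.

```python
import numpy as np

def numerical_hessian(f, w, h=1e-4):
    """Central finite-difference Hessian of a scalar function f at point w."""
    w = np.asarray(w, dtype=float)
    m = w.size
    H = np.zeros((m, m))
    for i in range(m):
        for j in range(m):
            ei = np.zeros(m)
            ej = np.zeros(m)
            ei[i] = h
            ej[j] = h
            H[i, j] = (f(w + ei + ej) - f(w + ei - ej)
                       - f(w - ei + ej) + f(w - ei - ej)) / (4.0 * h ** 2)
    return 0.5 * (H + H.T)                      # symmetrize against rounding noise

def hessian_report(H, tol=1e-8):
    """Summarize the spectrum: a Laplace fit needs all eigenvalues positive."""
    eig = np.linalg.eigvalsh(H)                 # ascending order
    return {
        "min_eig": float(eig[0]),
        "n_negative": int(np.sum(eig < -tol)),
        "positive_definite": bool(eig[0] > tol),
    }

# Hypothetical stand-ins for a loss surface near a stationary point.
bowl = lambda w: w[0] ** 2 + 2.0 * w[1] ** 2    # genuine minimum
saddle = lambda w: w[0] ** 2 - w[1] ** 2        # saddle point

bowl_report = hessian_report(numerical_hessian(bowl, [0.0, 0.0]))
saddle_report = hessian_report(numerical_hessian(saddle, [0.0, 0.0]))
print(bowl_report)
print(saddle_report)
```

A negative eigenvalue makes the Gaussian integral behind the evidence formula undefined, which is why the referee treats this check as load-bearing.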

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive report and the opportunity to clarify and strengthen the manuscript. We address each major comment below and commit to revisions that directly respond to the concerns raised.

Point-by-point responses
  1. Referee: [Method section (Laplace approximation derivation)] The central claim that the Laplace approximation yields a reliable analytic marginal likelihood for loss-weight selection rests on the assumption that the posterior over network weights is locally quadratic and unimodal. No verification of this is provided (e.g., no check that the Hessian at the MAP estimate is positive definite or that negative eigenvalues are absent), which is load-bearing for the automatic optimization procedure.

    Authors: We agree that verification of the local quadratic assumption is important. In the revised manuscript we will add explicit checks on the Hessian at the MAP estimate for each PDE example, reporting the eigenvalues to confirm positive-definiteness and discussing any implications for the validity of the analytic evidence computation. revision: yes

  2. Referee: [Numerical experiments (heat, wave, Burgers' sections)] Results on the three PDEs report only qualitative agreement with exact/reference solutions. No quantitative metrics (L2 errors, relative errors, or convergence rates) are supplied, nor is there an ablation on how the evidence-based weights compare to manual or heuristic tuning.

    Authors: The current manuscript indeed presents only qualitative results. We will revise the numerical sections to include L2 and relative error metrics against exact or reference solutions, as well as an ablation study that compares evidence-optimized weights against manual tuning and standard heuristics such as gradient-norm balancing. revision: yes

  3. Referee: [Experiments and discussion] No comparison is made to existing Bayesian PINN approaches (sampling or VI) or to standard loss-weight heuristics, leaving open whether the analytic evidence computation improves upon prior methods in practice.

    Authors: We will add a new subsection that benchmarks the Laplace-evidence method against representative sampling-based and variational Bayesian PINNs (where computational resources permit) and against common loss-weight heuristics. The comparison will emphasize wall-clock time and accuracy on the same PDE test cases to quantify practical gains from the analytic evidence route. revision: yes
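
The quantitative metrics promised in response 2 are straightforward to compute once predictions and a reference solution are on a common grid. The sketch below shows one plausible pair: a relative L2 error and the fraction of reference values that fall inside the 2σ band. The function names and the synthetic arrays are hypothetical, and the paper may define its metrics differently.

```python
import numpy as np

def relative_l2_error(u_pred, u_ref):
    """||u_pred - u_ref||_2 / ||u_ref||_2 over all grid points."""
    u_pred = np.asarray(u_pred, dtype=float)
    u_ref = np.asarray(u_ref, dtype=float)
    return np.linalg.norm(u_pred - u_ref) / np.linalg.norm(u_ref)

def coverage_2sigma(u_pred, sigma, u_ref):
    """Fraction of reference values lying within u_pred +/- 2 * sigma."""
    inside = np.abs(np.asarray(u_pred) - np.asarray(u_ref)) <= 2.0 * np.asarray(sigma)
    return float(np.mean(inside))

# Synthetic example: a prediction that is uniformly 1% too large.
x = np.linspace(0.0, 1.0, 101)
u_ref = np.sin(np.pi * x)
u_pred = 1.01 * u_ref
sigma = np.full_like(x, 0.02)

err = relative_l2_error(u_pred, u_ref)
cov = coverage_2sigma(u_pred, sigma, u_ref)
print(err, cov)
```

Reporting both numbers together separates accuracy from calibration: a model can have a small error with bands that are far too wide or too narrow.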

Circularity Check

0 steps flagged

No circularity: derivation relies on independent Laplace approximation

Full rationale

The paper introduces a Bayesian PINN formulation that applies the standard Laplace approximation to compute analytic model evidence for loss-weight optimization. No load-bearing step reduces a claimed prediction or result to a quantity already fitted to the target data by the paper's own equations. No self-citation chains or ansatzes are invoked to justify core premises; the approach is presented as a direct application of existing Bayesian tools to the PINN setting. The central claim therefore remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review supplies no explicit free parameters, axioms, or invented entities; the central claim rests on the unstated premise that the Laplace approximation is valid for the PINN posterior, which is not quantified here.

pith-pipeline@v0.9.0 · 5634 in / 1218 out tokens · 33962 ms · 2026-05-24T08:02:18.029728+00:00 · methodology

Reference graph

Works this paper leans on

75 extracted references · 75 canonical work pages · 9 internal anchors

  1. [1]

    Y. LeCun, Y. Bengio, G. Hinton, Deep learning, Nature 521 (2015) 436. URL https://doi.org/10.1038/nature14539

  2. [2]

    P. Mehta, M. Bukov, C.-H. Wang, A. G. Day, C. Richardson, C. K. Fisher, D. J. Schwab, A high-bias, low-variance introduction to machine learning for physicists, Physics Reports 810 (2019) 1–124. doi:10.1016/j.physrep.2019.03.001. URL http://www.sciencedirect.com...

  3. [3]

    K. M. Graczyk, P. Plonski, R. Sulej, Neural Network Parameterizations of Electromagnetic Nucleon Form Factors, JHEP 09 (2010) 053. arXiv:1006.0342, doi:10.1007/JHEP09(2010)053

  4. [4]

    K. M. Graczyk, C. Juszczak, Proton radius from Bayesian inference, Phys. Rev. C90 (2014) 054334. arXiv:1408.0150, doi:10.1103/PhysRevC.90.054334

  5. [5]

    K. M. Graczyk, M. Matyka, Predicting porosity, permeability, and tortuosity of porous media from images by deep learning, Scientific Reports 10 (1) (2020) 1–11

  6. [6]

    J.-L. Wu, H. Xiao, E. Paterson, Physics-informed machine learning approach for augmenting turbulence models: A comprehensiv…, Phys. Rev. Fluids 3 (2018) 074602. doi:10.1103/PhysRevFluids.3.074602. URL https://link.aps.org/doi/10.1103/PhysRevFluids.3.074602

  7. [7]

    S. Otten, S. Caron, W. de Swart, M. van Beekveld, L. Hendriks, C. van Leeuwen, D. Podareanu, R. Ruiz de Austri, R. Verheyen, Event Generation and Statistical Sampling for Physics with Deep Generative Models and a Density Information Buffer, Nature Commun. 12 (1) (2021) 2985. arXiv:1901.00875, doi:10.1038/s41467-021-22616-z

  8. [8]

    T. J. Sejnowski, The Deep Learning Revolution, MIT Press, Cambridge, MA, 2018

  9. [9]

    I. E. Lagaris, A. Likas, D. I. Fotiadis, Artificial neural networks for solving ordinary and partial differential equations, IEEE Transactions on Neural Networks 9 (5) (1998) 987–1000. doi:10.1109/72.712178

  10. [10]

    I. Lagaris, A. Likas, D. Fotiadis, Artificial neural network methods in quantum mechanics, Computer Physics Communications 104 (1) (1997) 1–14

  11. [11]

    M. Raissi, P. Perdikaris, G. E. Karniadakis, Physics informed deep learning (part i): Data-driven solutions of nonlinear partial differential equations, arXiv preprint arXiv:1711.10561 (2017)

  12. [12]

    M. Raissi, P. Perdikaris, G. E. Karniadakis, Physics informed deep learning (part ii): Data-driven discovery of nonlinear partial differential equations, arXiv preprint arXiv:1711.10566 (2017)

  13. [13]

    M. Raissi, P. Perdikaris, G. Karniadakis, Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations, Journal of Computational Physics 378 (2019) 686–707.

  14. [14]

    S. Mishra, R. Molinaro, Physics informed neural networks for simulating radiative transfer, Journal of Quantitative Spectroscopy and Radiative Transfer 270 (2021) 107705. doi:10.1016/j.jqsrt.2021.107705

  15. [15]

    S. Mishra, R. Molinaro, Estimates on the generalization error of physics informed neural networks (pinns) for approximating a class of inverse problems for pdes (2021). arXiv:2007.01138

  16. [16]

    J. Sirignano, K. Spiliopoulos, Dgm: A deep learning algorithm for solving partial differential equations, Journal of Computational Physics 375 (2018) 1339–1364. doi:10.1016/j.jcp.2018.08.029. URL http://dx.doi.org/10.1016/j.jcp.2018.08.029

  17. [17]

    L. Lu, X. Meng, Z. Mao, G. E. Karniadakis, DeepXDE: A deep learning library for solving differential equations, SIAM Review 63 (1) (2021) 208–228. doi:10.1137/19m1274067. URL https://doi.org/10.1137%2F19m1274067

  18. [18]

    E. Haghighat, M. Raissi, A. Moure, H. Gomez, R. Juanes, A deep learning framework for solution and discovery in solid mechanics (2020). doi:10.48550/ARXIV.2003.02751. URL https://arxiv.org/abs/2003.02751

  19. [19]

    N. Thuerey, P. Holl, M. Mueller, P. Schnell, F. Trost, K. Um, Physics-based Deep Learning, WWW, 2021. URL https://physicsbaseddeeplearning.org

  20. [20]

    Z. Hao, J. Yao, C. Su, H. Su, Z. Wang, F. Lu, Z. Xia, Y. Zhang, S. Liu, L. Lu, J. Zhu, Pinnacle: A comprehensive benchmark of physics-informed neural networks for solving pdes (2023). arXiv:2306.08827

  21. [21]

    S. Jiang, X. Li, Solving non-local fokker-planck equations by deep learning (2022). doi:10.48550/ARXIV.2206.03439. URL https://arxiv.org/abs/2206.03439

  22. [22]

    S. Subramanian, R. M. Kirby, M. W. Mahoney, A. Gholami, Adaptive self-supervision algorithms for physics-informed neural networks (2022). doi:10.48550/ARXIV.2207.04084. URL https://arxiv.org/abs/2207.04084

  23. [23]

    Z. Long, Y. Lu, X. Ma, B. Dong, Pde-net: Learning pdes from data (2017). doi:10.48550/ARXIV.1710.09668. URL https://arxiv.org/abs/1710.09668

  24. [24]

    S. Cuomo, V. S. di Cola, F. Giampaolo, G. Rozza, M. Raissi, F. Piccialli, Scientific machine learning through physics-informed neural networks: Where we are and what's next (2022). arXiv:2201.05624

  25. [25]

    G. E. Karniadakis, I. G. Kevrekidis, L. Lu, P. Perdikaris, S. Wang, L. Yang, Physics-informed machine learning, Nature Reviews Physics 3 (6) (2021) 422–440

  26. [26]

    S. A. Faroughi, N. Pawar, C. Fernandes, M. Raissi, S. Das, N. K. Kalantari, S. K. Mahjour, Physics-guided, physics-informed, and physics-encoded neural networks in scientific computing (2023). arXiv:2211.07377

  27. [27]

    C. Meng, S. Seo, D. Cao, S. Griesemer, Y. Liu, When physics meets machine learning: A survey of physics-informed mac… (2022). doi:10.48550/ARXIV.2203.16797. URL https://arxiv.org/abs/2203.16797

  28. [28]

    Z. Hao, S. Liu, Y. Zhang, C. Ying, Y. Feng, H. Su, J. Zhu, Physics-informed machine learning: A survey on problems, methods and applications (2023). arXiv:2211.08064

  29. [29]

    F. Rosenblatt, Principles of Neurodynamics, New York: Spartan, 1962. URL http://www.dtic.mil/docs/citations/AD0256582

  30. [30]

    J. Hertz, Introduction To The Theory Of Neural Computation, CRC Press, 2018. URL https://books.google.pl/books?id=NwpQDwAAQBAJ

  31. [31]

    I. Goodfellow, Y. Bengio, A. Courville, Deep Learning, MIT Press, 2016, http://www.deeplearningbook.org

  32. [32]

    C. M. Bishop, Neural Networks for Pattern Recognition, Oxford University Press, 1995

  33. [33]

    G. D'Agostini, Bayesian Reasoning in Data Analysis, World Scientific, 2003. URL http://www.worldscientific.com/worldscibooks/10.1142/5262

  34. [34]

    H. Jeffreys, Theory of Probability, Oxford University Press, 1961

  35. [35]

    J. Gawlikowski, C. R. N. Tassi, M. Ali, J. Lee, M. Humt, J. Feng, A. Kruspe, R. Triebel, P. Jung, R. Roscher, M. Shahzad, W. Yang, R. Bamler, X. X. Zhu, A survey of uncertainty in deep neural networks, Artificial Intelligence Review 56 (1) (2023) 1513–1589. doi:10.1007/s10462-023-10562-9. URL https://doi.org/10.1007/s10462-023-10562-9

  36. [36]

    S. F. Gull, Bayesian Inductive Inference and Maximum Entropy, Springer Netherlands, Dordrecht, 1988, pp. 53–74. doi:10.1007/978-94-009-3049-0_4. URL https://doi.org/10.1007/978-94-009-3049-0_4

  37. [37]

    A. F. Psaros, X. Meng, Z. Zou, L. Guo, G. E. Karniadakis, Uncertainty quantification in scientific machine learning: Methods, metrics, and comparisons, Journal of Computational Physics 477 (2023) 111902. doi:10.1016/j.jcp.2022.111902. URL https://www.sciencedirect.com/science/article/pii/S0021999122009652

  38. [38]

    Y. Zhu, N. Zabaras, Bayesian deep convolutional encoder–decoder networks for surrogate modeling and uncertainty quantification, Journal of Computational Physics 366 (2018) 415–447. doi:10.1016/j.jcp.2018.04.018

  39. [39]

    N. Geneva, N. Zabaras, Modeling the dynamics of pde systems with physics-constrained deep auto-regressive networks, Journal of Computational Physics 403 (2020) 109056. doi:10.1016/j.jcp.2019.109056. URL https://www.sciencedirect.com/science/article/pii/S0021999119307612

  40. [40]

    A. Besginow, M. Lange-Hegermann, Constraining gaussian processes to systems of linear ordinary differential equations (2022). doi:10.48550/ARXIV.2208.12515. URL https://arxiv.org/abs/2208.12515

  41. [41]

    C. Rasmussen, C. Williams, Gaussian Processes for Machine Learning, Adaptive Computation and Machine Learning, MIT Press, Cambridge, MA, USA, 2006

  42. [42]

    M. Raissi, P. Perdikaris, G. E. Karniadakis, Machine learning of linear differential equations using gaussian processes, Journal of Computational Physics 348 (2017) 683–693. doi:10.1016/j.jcp.2017.07.050

  43. [43]

    M. Raissi, P. Perdikaris, G. E. Karniadakis, Inferring solutions of differential equations using noisy multi-fidelity data, Journal of Computational Physics 335 (2017) 736–746

  44. [44]

    R. M. Neal, Bayesian learning for neural networks, Ph.D. thesis, Graduate Department of Computer Science in University of Toronto (1995)

  45. [45]

    L. Yang, X. Meng, G. E. Karniadakis, B-pinns: Bayesian physics-informed neural networks for forward and inverse pde…, Journal of Computational Physics 425 (2021) 109913. doi:10.1016/j.jcp.2020.109913. URL http://dx.doi.org/10.1016/j.jcp.2020.109913

  46. [46]

    C. Bonneville, C. Earls, Bayesian deep learning for partial differential equation parameter discovery with sparse and noisy data, Journal of Computational Physics: X 16 (2022) 100115

  47. [47]

    K. More, T. Tripura, R. Nayek, S. Chakraborty, A bayesian framework for learning governing partial differential equation from data (2023). arXiv:2306.04894.

  48. [48]

    L. Lu, P. Jin, G. Pang, Z. Zhang, G. E. Karniadakis, Learning nonlinear operators via deeponet based on the universal approximation theorem of operators, Nature Machine Intelligence 3 (3) (2021) 218–229. doi:10.1038/s42256-021-00302-5

  49. [49]

    M. Magris, A. Iosifidis, Bayesian learning for neural networks: an algorithmic survey (2022). doi:10.48550/ARXIV.2211.11865. URL https://arxiv.org/abs/2211.11865

  50. [50]

    D. MacKay, Bayesian methods for adaptive models, Ph.D. thesis, California Institute of Technology (1991)

  51. [51]

    D. J. C. MacKay, Bayesian interpolation, Neural Computation 4 (3) (1992) 415–447. doi:10.1162/neco.1992.4.3.415. URL https://doi.org/10.1162/neco.1992.4.3.415

  52. [52]

    D. J. C. MacKay, A practical bayesian framework for backpropagation networks, Neural Computation 4 (3) (1992) 448–472. doi:10.1162/neco.1992.4.3.448. URL https://doi.org/10.1162/neco.1992.4.3.448

  53. [53]

    S. F. Gull, Developments in Maximum Entropy Data Analysis, Springer Netherlands, Dordrecht, 1989, pp. 53–71. doi:10.1007/978-94-015-7860-8_4. URL https://doi.org/10.1007/978-94-015-7860-8_4

  54. [54]

    J. Skilling, On Parameter Estimation and Quantified Maxent, Springer Netherlands, Dordrecht, 1991, pp. 267–273. doi:10.1007/978-94-011-3460-6_25. URL https://doi.org/10.1007/978-94-011-3460-6_25

  55. [55]

    L. Alvarez-Ruso, K. M. Graczyk, E. Saul-Sala, Nucleon axial form factor from a Bayesian neural-network analysis of neutrino-scattering data, Phys. Rev. C 99 (2) (2019) 025204. arXiv:1805.00905, doi:10.1103/PhysRevC.99.025204

  56. [56]

    K. M. Graczyk, C. Juszczak, Zemach moments of the proton from Bayesian inference, Phys. Rev. C91 (4) (2015) 045205. doi:10.1103/PhysRevC.91.045205

  57. [57]

    K. M. Graczyk, C. Juszczak, Applications of Neural Networks in Hadron Physics, J. Phys. G42 (3) (2015) 034019. arXiv:1409.5244, doi:10.1088/0954-3899/42/3/034019

  58. [58]

    K. M. Graczyk, Two-Photon Exchange Effect Studied with Neural Networks, Phys. Rev. C84 (2011) 034314. arXiv:1106.1204, doi:10.1103/PhysRevC.84.034314

  59. [59]

    A. Paszke, S. Gross, F. Massa, A. Lerer, J. Bradbury, G. Chanan, T. Killeen, Z. Lin, N. Gimelshein, L. Antiga, A. Desmaison, A. Kopf, E. Yang, Z. DeVito, M. Raison, A. Tejani, S. Chilamkurthy, B. Steiner, L. Fang, J. Bai, S. Chintala, Pytorch: An imperative style, high-performance deep learning library, in: Advances in Neural Information Processing Sys ...

  60. [60]

    O. Hennigh, S. Narasimhan, M. A. Nabian, A. Subramaniam, K. Tangsali, M. Rietmann, J. del Aguila Ferrandis, W. Byeon, Z. Fang, S. Choudhry, Nvidia simnet™: an ai-accelerated multi-physics simulation framework (2020). arXiv:2012.07938

  61. [61]

    C. L. Wight, J. Zhao, Solving allen-cahn and cahn-hilliard equations using the adaptive physics informed neural networks (2020). arXiv:2007.04542

  62. [62]

    S. Wang, Y. Teng, P. Perdikaris, Understanding and mitigating gradient pathologies in physics-informed neural networks (2020). arXiv:2001.04536

  63. [63]

    S. Wang, X. Yu, P. Perdikaris, When and why pinns fail to train: A neural tangent kernel perspective (2020). arXiv:2007.14527

  64. [64]

    L. McClenny, U. Braga-Neto, Self-adaptive physics-informed neural networks using a soft attention mechanism (2022). arXiv:2009.04544.

  65. [65]

    G. Cybenko, Approximation by superpositions of a sigmoidal function, Math Control, Signal 2 (4) (1989) 303. doi:10.1007/BF02551274. URL http://dx.doi.org/10.1007/BF02551274

  66. [66]

    K. Hornik, M. Stinchcombe, H. White, Multilayer feedforward networks are universal approximators, Neural Networks 2 (1989) 359. URL http://www.sciencedirect.com/science/article/pii/0893608089900208

  67. [67]

    K.-I. Funahashi, On the approximate realization of continuous mappings by neural networks, Neural Networks 2 (3) (1989) 183–192. doi:10.1016/0893-6080(89)90003-8. URL http://www.sciencedirect.com/science/article/pii/0893608089900038

  68. [68]

    S. Geman, E. Bienenstock, R. Doursat, Neural networks and the bias/variance dilemma, Neural Computation 4 (1) (1992) 1–58. doi:10.1162/neco.1992.4.1.1. URL https://doi.org/10.1162/neco.1992.4.1.1

  69. [69]

    R. Hanson, J. Stutz, P. Cheeseman, Bayesian classification theory, Tech. Rep. Technical Report FIA-90-12-7-01, NASA (05 1991)

  70. [70]

    J. O. Berger, Statistical decision theory and Bayesian analysis, Springer-Verlag, New York, 1985

  71. [71]

    A. M. Chen, H.-m. Lu, R. Hecht-Nielsen, On the Geometry of Feedforward Neural Network Error Surfaces, Neural Computation 5 (6) (1993) 910–927. doi:10.1162/neco.1993.5.6.910

  72. [72]

    H. H. Thodberg, Ace of bayes: Application of neural…, 1993. URL https://api.semanticscholar.org/CorpusID:15593225

  73. [73]

    D. P. Kingma, J. Ba, Adam: A method for stochastic optimization (2017). arXiv:1412.6980

  74. [74]

    Y. Zhang, P. Khanduri, I. Tsaknakis, Y. Yao, M. Hong, S. Liu, An introduction to bi-level optimization: Foundations and applications in signal processing and machine learning (2023). arXiv:2308.00788

  75. [75]

    S. Brooks, A. Gelman, G. Jones, X.-L. Meng, Handbook of Markov Chain Monte Carlo, Chapman and Hall/CRC, 2011. doi:10.1201/b10905. URL http://dx.doi.org/10.1201/b10905