arxiv: 2605.16078 · v1 · pith:YAVSXEKDnew · submitted 2026-05-15 · 📊 stat.ML · cs.LG

A numerical study into neural network surrogate model performance for uncertainty propagation

Pith reviewed 2026-05-19 18:53 UTC · model grok-4.3

classification 📊 stat.ML · cs.LG
keywords neural networks · surrogate models · uncertainty propagation · stochastic PDE · physics-informed learning · extrapolation · heat conduction · weak residual loss

The pith

Neural network surrogates for stochastic heat conduction exhibit order-of-magnitude larger errors at distribution tails due to extrapolation.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper examines how well neural network models can replace traditional solvers when propagating uncertainty from a random heat source to the temperature field across the entire probability distribution. Models that predict typical outcomes accurately struggle with rare extreme cases, where worst-case errors are about an order of magnitude larger than the mean-field error. This matters because uncertainty propagation in engineering often requires reliable estimates of worst-case scenarios for safety and design. The study compares network types and training methods and finds that a basic fully connected network trained with a weak-form physics loss handles these difficult cases most effectively.

Core claim

The test case is the heat conduction equation driven by a highly stochastic source term that causes large variations in the temperature solution. Neural network surrogates, including fully connected networks and Deep Operator Networks, are trained using both data-driven and physics-informed losses. The largest errors arise on inputs that require extrapolation beyond the training data, and the worst-case errors are an order of magnitude larger than the mean-field error. The fully connected network trained with a weak-form residual loss yields the best accuracy on the numerically generated test datasets, particularly for the extreme samples.

What carries the argument

The weak form residual loss for training fully connected neural networks on the stochastic heat conduction problem, which integrates the governing equation over the domain to enforce physical consistency and aid generalization to out-of-distribution inputs.
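
To make the idea concrete, here is a minimal sketch of a weak-form residual loss. It is not the paper's implementation. It assumes a simplified 1D steady problem, -k u'' = f on (0, 1) with zero Dirichlet boundaries, piecewise-linear hat test functions on a uniform grid, and a lumped quadrature for the source term. The function name `weak_residual_loss` and its parameters are hypothetical.

```python
import numpy as np


def weak_residual_loss(u, f, k=1.0):
    """Mean squared weak-form residual for -k u'' = f on a uniform grid over [0, 1].

    u, f: nodal values including both end points (assumed setup, not the paper's).
    """
    u = np.asarray(u, dtype=float)
    f = np.asarray(f, dtype=float)
    h = 1.0 / (u.size - 1)
    # Integral of k u' phi_i' over the domain for each interior hat function phi_i,
    # with u interpolated linearly between nodes.
    stiffness = k * (2.0 * u[1:-1] - u[:-2] - u[2:]) / h
    # Integral of f phi_i, approximated by lumped quadrature (h * f_i).
    load = h * f[1:-1]
    residual = stiffness - load
    return float(np.mean(residual ** 2))


# Manufactured check: u = sin(pi x) solves -u'' = pi^2 sin(pi x).
x = np.linspace(0.0, 1.0, 101)
f = np.pi ** 2 * np.sin(np.pi * x)
loss_exact = weak_residual_loss(np.sin(np.pi * x), f)
loss_zero = weak_residual_loss(np.zeros_like(x), f)
print(loss_exact, loss_zero)
```

In a training loop, `u` would be the network output evaluated at the grid nodes, so only first derivatives of the prediction enter the loss, in contrast to a strong-form residual that needs second derivatives.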

If this is right

  • Errors in surrogate predictions for uncertainty propagation are primarily driven by the need to extrapolate to extreme stochastic inputs.
  • Using a weak-form residual loss during training improves a model's ability to handle samples outside the training distribution.
  • Deep Operator Networks do not outperform simpler fully connected networks in this setting for tail accuracy.
  • Explicit identification of outlier samples is necessary to properly account for their contribution to overall uncertainty.
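
The outlier-identification point above can be illustrated with a simple proxy. The summary does not describe the paper's procedure, so the sketch below substitutes an axis-aligned bounding-box test: a test input is flagged when any of its components falls outside the range seen in training. The function name `flag_extrapolated` and the toy arrays are hypothetical.

```python
import numpy as np


def flag_extrapolated(train_inputs, test_inputs):
    """Flag test rows with any feature outside the training min/max range.

    A bounding-box proxy for extrapolation (an assumption, not the paper's method).
    """
    train = np.asarray(train_inputs, dtype=float)
    test = np.asarray(test_inputs, dtype=float)
    lo = train.min(axis=0)
    hi = train.max(axis=0)
    return np.any((test < lo) | (test > hi), axis=1)


train = np.array([[0.0, 1.0], [1.0, 2.0], [0.5, 1.5]])
test = np.array([[0.5, 1.2], [1.5, 1.5], [0.2, 0.5]])
flags = flag_extrapolated(train, test)
print(flags)
```

A bounding box is a loose test: a point can sit inside the box yet outside the region the training data actually covers, so a tighter geometric criterion would flag more samples.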

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Adaptive sampling strategies during dataset generation could focus on filling in the tails to reduce extrapolation demands.
  • The findings may extend to other stochastic boundary value problems where solution variability is high.
  • Integrating uncertainty quantification techniques with these surrogates could provide error bounds for extreme predictions.

Load-bearing premise

The sampled training datasets cover the probability space of the stochastic source term sufficiently well that the observed large errors on extreme samples stem mainly from extrapolation rather than from inadequate training or model limitations.

What would settle it

Generating an augmented training set that includes additional samples from the tails of the source term distribution and then re-evaluating the maximum prediction errors on a held-out extreme test set; if the errors remain similarly large, the extrapolation explanation would be weakened.
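
One way the proposed check could be set up is sketched below, under stated assumptions. Tail samples are defined by an upper quantile of each sample's maximum temperature, and the tail is split into a part added to training and a held-out extreme test set. The quantile, the split fraction, and the name `split_tail` are illustrative choices, not values from the paper.

```python
import numpy as np


def split_tail(max_temps, tail_quantile=0.95, holdout_fraction=0.5, seed=0):
    """Split upper-tail samples into training-augmentation and held-out index sets."""
    max_temps = np.asarray(max_temps, dtype=float)
    threshold = np.quantile(max_temps, tail_quantile)
    tail_idx = np.flatnonzero(max_temps >= threshold)
    rng = np.random.default_rng(seed)
    rng.shuffle(tail_idx)  # randomize which tail samples are held out
    n_hold = int(round(holdout_fraction * tail_idx.size))
    holdout_idx = np.sort(tail_idx[:n_hold])
    augment_idx = np.sort(tail_idx[n_hold:])
    return augment_idx, holdout_idx, threshold


max_temps = np.arange(100.0)  # stand-in for per-sample maximum temperatures
augment_idx, holdout_idx, threshold = split_tail(max_temps, tail_quantile=0.9)
print(len(augment_idx), len(holdout_idx), threshold)
```

Retraining with `augment_idx` added and comparing the maximum error on `holdout_idx` before and after would be the comparison described above.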

Figures

Figures reproduced from arXiv: 2605.16078 by Kirubel Teferra, Noah Wade.

Figure 1. Solution fields from Source 2 (defined in Table 1) showing the: A) Maximum Peak, B) Minimum Peak, C) Median Temperature
Figure 2. Histogram of the maximum temperatures of the solution field over 2000 samples for Source 2. The range covers over 400 …
Figure 3. Schematic of the neural network structures for (a) the fully connected neural network and (b) the PCA DeepONet. The bias vector of the last layer on both networks is denoted as bf to indicate that it is not a trainable variable but instead is fixed to the mean value of the prescribed Dirichlet boundary conditions. Values of the hyperparameters such as the number and width of each layer can be found in …
Figure 4. The PINN model benchmark for sample #196 from Source 2 has the highest ME of all benchmark tests. The error is largely …
Figure 5. Convergence history of the maximum error (ME) during training for PINNs trained on an increasing number of random …
Figure 6. Histograms of the MPSE of the PINN model for (a) training data, and (b) test data for increasing number of samples.
Figure 7. The predicted solution field from Source 1 with the greatest MPSE, where the true temperature is underestimated by …
Figure 8. Maximum per-sample error (MPSE) histograms for each model’s (a) training data and (b) testing data.
Figure 9. The maximum ground truth temperature of each sample is plotted against its (a) maximum per-sample error (MPSE) and (b) the ℓ∞-norm of the difference between the model predictions and the ground truth divided by the ℓ∞-norm of the ground truth for all 1500 testing samples for Source 2. The ℓ∞-norm is also shown to demonstrate that the trend shown in (a) is not the result of numerical sensitivity assoc…
Figure 10. Relationships between sample weight and the (a) training error and (b) maximum solution field temperature for Source 2
Figure 11. A comparison of the weighted vs. non-weighted PINN. Each model is trained on the same 200 training samples and each …
Figure 12. Convergence rates for PINN models trained on a dataset of 200 samples using both weighted and non-weighted loss functions.
Figure 13. Plots of the test data’s maximum ground truth temperature versus MPSE for the PINN model trained on 400 samples from Source 1. The samples are demarcated through color coding based on if they are flagged as an outlier (circle) or not (x’s) in (a) and further delineated by type of outlier as outlier temperature and outlier input (diamonds), outlier temperature (squares), outlier input only (triangles), and…
Figure 14. Plots of ground truth maximum temperature versus MPSE for Source 2 test data for the PINN model, coded by whether the MPSE is located on the boundary (circles) or interior of the domain (x’s) for (a) wb = 10^7 and (b) wb = 10^8. Adjusting the boundary weight dramatically changes the MPSE. When the boundary weight (wb) is set too high, training is over-constrained and the model fails to adequately learn to …
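
The two error measures named in the Figure 9 caption can be sketched as follows. The sketch assumes that MPSE is the largest absolute pointwise error within one sample and that the norm in the caption is the max norm; the paper's exact definitions may differ. The function names and toy arrays are hypothetical.

```python
import numpy as np


def max_per_sample_error(pred, truth):
    """Largest absolute pointwise error for each sample (rows are samples)."""
    return np.max(np.abs(pred - truth), axis=1)


def relative_max_norm_error(pred, truth):
    """Max-norm of the error divided by the max-norm of the ground truth, per sample."""
    return np.max(np.abs(pred - truth), axis=1) / np.max(np.abs(truth), axis=1)


truth = np.array([[1.0, 2.0, 4.0], [10.0, 20.0, 40.0]])
pred = np.array([[1.0, 2.5, 4.0], [10.0, 20.0, 36.0]])
mpse = max_per_sample_error(pred, truth)
rel = relative_max_norm_error(pred, truth)
print(mpse, rel)
```

In this toy data the hotter sample has the larger absolute error but the smaller relative error, which is why reporting both measures helps separate a genuine accuracy trend from a pure scale effect.
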
Original abstract

Neural network surrogate models have emerged as a promising approach to model solution fields for a wide variety of boundary value problems encountered in physical modeling. Stochastic problems represent an area of particularly high interest because of the potential to significantly reduce the repeated evaluation of expensive forward models via traditional numerical solvers when conducting parametric analysis. However, many studies found in the literature primarily focus on the ability of neural network surrogate models to represent deterministic samples or mean field solutions and largely overlook surrogate model performance at the tails of the distribution. The present study examines in detail the ability of neural network surrogate models to capture the full distribution of solution fields over the entire probability space, while emphasis is placed at the tails of the distribution. Serving as a canonical problem is the heat conduction equation with a highly stochastic source term, inducing extremely large variation in the thermal solution field. Comparisons are made between a classic feed-forward fully connected network and a Deep Operator Network architecture, using both data-driven and physics-informed loss functions. Results show that the worst-case prediction errors are an order of magnitude larger than the mean field error, highlighting the importance of the outlier samples. The large errors associated with extreme samples result from the networks having to extrapolate beyond the bounds of the training data. A method for identifying these samples is presented along with a discussion of potential approaches to account for their errors. Among the models considered, the fully connected neural network trained using a weak form residual loss performs best in handling these extrapolated inputs, achieving the highest prediction accuracy for the numerically produced datasets.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 1 minor

Summary. The manuscript presents a numerical study comparing neural network surrogate models (fully connected networks and DeepONets) with data-driven and physics-informed loss functions (including weak-form residual) for uncertainty propagation in the heat conduction equation subject to a highly stochastic source term. Emphasis is placed on performance across the full probability space, particularly at the distribution tails. Key results include an order-of-magnitude gap between mean and worst-case prediction errors, attribution of large errors on extreme samples to extrapolation beyond training bounds, presentation of a method to identify such samples, and the conclusion that the fully connected network trained with weak-form residual loss performs best on extrapolated inputs for the generated datasets.

Significance. If the experimental details hold, the work usefully draws attention to the practical challenges of achieving reliable surrogate coverage over entire distributions in stochastic PDE problems rather than just mean fields. The reported order-of-magnitude disparity between average and outlier errors, together with the explicit comparison of architectures and loss formulations, supplies concrete numerical evidence that could guide model selection in uncertainty quantification applications. The identification method for extreme samples is a potentially reusable contribution.

major comments (1)
  1. [Dataset generation and numerical experiments] The description of training dataset generation (referenced in the abstract and results discussion) provides no quantitative information on Monte Carlo sample count, tail quantiles retained, or coverage metrics for the stochastic source term distribution. This detail is load-bearing for the central claim that large errors on extreme samples arise primarily from extrapolation rather than under-sampling of tails, optimization failure, or capacity limits; without it, the ranking among models could be confounded by unequal handling of rare but in-distribution events.
minor comments (1)
  1. The abstract states that 'a method for identifying these samples is presented' but does not reference the specific section, figure, or algorithm number where this method appears, making it difficult for readers to locate and evaluate the procedure.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their careful and constructive review of our manuscript. The major comment on dataset generation details is well taken, and we address it point by point below. We will revise the manuscript to incorporate the requested quantitative information.

Point-by-point responses
  1. Referee: [Dataset generation and numerical experiments] The description of training dataset generation (referenced in the abstract and results discussion) provides no quantitative information on Monte Carlo sample count, tail quantiles retained, or coverage metrics for the stochastic source term distribution. This detail is load-bearing for the central claim that large errors on extreme samples arise primarily from extrapolation rather than under-sampling of tails, optimization failure, or capacity limits; without it, the ranking among models could be confounded by unequal handling of rare but in-distribution events.

    Authors: We agree that the current manuscript provides only a qualitative description of the training dataset generation and lacks the specific quantitative details requested. We will add these to a new or expanded subsection on dataset generation, reporting the Monte Carlo sample count, the quantile thresholds used to identify tail samples, and coverage metrics (such as the spanned range of the stochastic source term and any binning or density checks performed). This revision will directly support the claim that the large errors on extreme samples result from extrapolation beyond training bounds. We will also add a short discussion clarifying that all compared models were trained and tested on identical datasets, so relative rankings are not confounded by differential sampling of rare events; we will further note the diagnostic checks (e.g., loss convergence and residual norms) that indicate optimization and capacity were not the dominant factors for the observed outliers. revision: yes

Circularity Check

0 steps flagged

No circularity: results from direct numerical experiments on standard problem

The paper is an empirical numerical study comparing neural network architectures and loss functions (data-driven vs. physics-informed, fully connected vs. DeepONet) on a canonical stochastic heat conduction problem. All performance claims, including the ranking of the fully connected network with weak-form residual loss on extrapolated samples, are obtained from training on generated datasets and evaluating error metrics against reference solutions. No derivation chain exists that reduces a claimed prediction or first-principles result to its own inputs by construction; there are no self-definitional equations, fitted parameters renamed as predictions, load-bearing self-citations, or uniqueness theorems imported from prior author work. The methodology is self-contained and externally falsifiable via direct numerical solver comparisons.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The study rests on standard assumptions in scientific machine learning for PDE surrogates. No new entities are postulated. Free parameters include typical neural network hyperparameters that are tuned but not detailed in the abstract.

free parameters (1)
  • Neural network architecture and training hyperparameters
    Choices such as number of layers, neurons per layer, learning rate, and batch size that are selected to achieve the reported performance but not enumerated in the abstract.
axioms (1)
  • domain assumption: The stochastic source term produces extremely large variation in the thermal solution field
    Invoked to justify the canonical problem choice and the focus on tail performance.

Reference graph

Works this paper leans on

47 extracted references · 47 canonical work pages · 3 internal anchors

  1. [1]

    I. E. Lagaris, A. Likas, D. I. Fotiadis, Artificial neural networks for solving ordinary and partial differential equations, IEEE Transactions on Neural Networks 9 (1998) 987–1000

  2. [2]

    M. Raissi, P. Perdikaris, G. Karniadakis, Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations, Journal of Computational Physics 378 (2019) 686–707. doi:10.1016/j.jcp.2018.10.045

  3. [3]

    S. Cai, Z. Wang, S. Wang, P. Perdikaris, G. E. Karniadakis, Physics-informed neural networks for heat transfer problems, Journal of Heat Transfer 143 (2021)

  4. [4]

    N. Sukumar, A. Srivastava, Exact imposition of boundary conditions with distance functions in physics-informed deep neural networks, Computer Methods in Applied Mechanics and Engineering 389 (2022) 114333

  5. [5]

    R. Xu, D. Zhang, M. Rong, N. Wang, Weak form theory-guided neural network (TgNN-wf) for deep learning of subsurface single- and two-phase flow, Journal of Computational Physics 436 (2021) 110318. doi:10.1016/j.jcp.2021.110318

  6. [6]

    S. Cai, Z. Mao, Z. Wang, M. Yin, G. E. Karniadakis, Physics-informed neural networks (PINNs) for fluid mechanics: A review, Acta Mechanica Sinica 37 (2021) 1727–1738

  7. [7]

    S. Goswami, M. Yin, Y. Yu, G. E. Karniadakis, A physics-informed variational DeepONet for predicting crack path in quasi-brittle materials, Computer Methods in Applied Mechanics and Engineering 391 (2022) 114587. doi:10.1016/j.cma.2022.114587

  8. [8]

    A. A. Howard, M. Perego, G. E. Karniadakis, P. Stinis, Multifidelity deep operator networks for data-driven and physics-informed problems, Journal of Computational Physics 493 (2023) 112462.

  9. [9]

    Z. Li, N. Kovachki, K. Azizzadenesheli, B. Liu, K. Bhattacharya, A. Stuart, A. Anandkumar, Fourier neural operator for parametric partial differential equations, arXiv preprint arXiv:2010.08895 (2020)

  10. [10]

    W. Zhang, W. Suo, J. Song, W. Cao, Physics informed neural networks (PINNs) as intelligent computing technique for solving partial differential equations: Limitation and future prospects, arXiv preprint arXiv:2411.18240 (2024)

  11. [11]

    S. Markidis, The old and the new: Can physics-informed deep-learning replace traditional linear solvers?, Frontiers in big Data 4 (2021) 669097

  12. [12]

    F. F. de la Mata, A. Gijón, M. Molina-Solana, J. Gómez-Romero, Physics-informed neural networks for data-driven simulation: Advantages, limitations, and opportunities, Physica A: Statistical Mechanics and its Applications 610 (2023) 128415

  13. [13]

    T. G. Grossmann, U. J. Komorowska, J. Latz, C.-B. Schönlieb, Can physics-informed neural networks beat the finite element method?, IMA Journal of Applied Mathematics 89 (2024) 143–174

  14. [14]

    A. Sacchetti, B. Bachmann, K. Löffel, U.-M. Künzi, B. Paoli, Neural networks to solve partial differential equations: A comparison with finite elements, IEEE Access 10 (2022) 32271–32279

  15. [15]

    J. Hou, Y. Li, S. Ying, Enhancing PINNs for solving PDEs via adaptive collocation point movement and adaptive loss weighting, Nonlinear Dynamics 111 (2023) 15233–15261

  16. [16]

    M. A. Nabian, R. J. Gladstone, H. Meidani, Efficient training of physics-informed neural networks via importance sampling, Computer-Aided Civil and Infrastructure Engineering 36 (2021) 962–977

  17. [17]

    A. Daw, J. Bu, S. Wang, P. Perdikaris, A. Karpatne, Rethinking the importance of sampling in physics-informed neural networks, arXiv preprint arXiv:2207.02338 (2022)

  18. [18]

    Z. Xiang, W. Peng, X. Liu, W. Yao, Self-adaptive loss balanced physics-informed neural networks, Neurocomputing 496 (2022) 11–34

  19. [19]

    J. Zhang, S. Zhang, G. Lin, Multiauto-deeponet: A multi-resolution autoencoder deeponet for nonlinear dimension reduction, uncertainty quantification and operator learning of forward and inverse stochastic problems, arXiv preprint arXiv:2204.03193 (2022)

  20. [20]

    Y. Yang, P. Perdikaris, Adversarial uncertainty quantification in physics-informed neural networks, Journal of Computational Physics 394 (2019) 136–152

  21. [21]

    B. Lütjens, C. H. Crawford, M. Veillette, D. Newman, Pce-pinns: Physics-informed neural networks for uncertainty propagation in ocean modeling, arXiv preprint arXiv:2105.02939 (2021)

  22. [22]

    Z. Hu, K. Shukla, G. E. Karniadakis, K. Kawaguchi, Tackling the curse of dimensionality with physics-informed neural networks, Neural Networks 176 (2024) 106369.

  23. [23]

    D. Zhang, L. Guo, G. E. Karniadakis, Learning in modal space: Solving time-dependent stochastic pdes using physics-informed neural networks, SIAM Journal on Scientific Computing 42 (2020) A639–A665

  24. [24]

    Q. Lin, C. Zhang, X. Meng, Z. Guo, Monte carlo physics-informed neural networks for multiscale heat conduction via phonon boltzmann transport equation, arXiv preprint arXiv:2408.10965 (2024)

  25. [25]

    M. A. Nabian, H. Meidani, A deep learning solution approach for high-dimensional random differential equations, Probabilistic Engineering Mechanics 57 (2019) 14–25

  26. [26]

    L. Lu, P. Jin, G. Pang, Z. Zhang, G. E. Karniadakis, Learning nonlinear operators via deeponet based on the universal approximation theorem of operators, Nature machine intelligence 3 (2021) 218–229

  27. [27]

    Z. Li, H. Zheng, N. Kovachki, D. Jin, H. Chen, B. Liu, K. Azizzadenesheli, A. Anandkumar, Physics-informed neural operator for learning partial differential equations, ACM/JMS Journal of Data Science 1 (2024) 1–27

  28. [28]

    K. Kontolati, S. Goswami, G. Em Karniadakis, M. D. Shields, Learning nonlinear operators in latent spaces for real-time predictions of complex dynamics in physical systems, Nature Communications 15 (2024) 5101

  29. [29]

    J. Exenberger, S. Ranftl, R. Peharz, Deep polynomial chaos expansion, arXiv preprint arXiv:2507.21273 (2025)

  30. [30]

    V. Kumar, S. Goswami, K. Kontolati, M. D. Shields, G. E. Karniadakis, Synergistic learning with multi-task deeponet for efficient PDE problem solving, Neural Networks 184 (2025) 107113

  31. [31]

    S. Karumuri, L. Graham-Brady, S. Goswami, Physics-informed latent neural operator for real-time predictions of complex physical systems, arXiv preprint arXiv:2501.08428 (2025)

  32. [32]

    L. Kong, J. Sun, C. Zhang, Sde-net: Equipping deep neural networks with uncertainty estimates, arXiv preprint arXiv:2008.10546 (2020)

  33. [33]

    E. Pickering, S. Guth, G. E. Karniadakis, T. P. Sapsis, Discovering and forecasting extreme events via active learning in neural operators, Nature Computational Science 2 (2022) 823–833

  34. [34]

    M. Wabartha, A. Durand, V. Francois-Lavet, J. Pineau, Handling black swan events in deep learning with diversely extrapolated neural networks, in: Proceedings of the Twenty-Ninth International Conference on International Joint Conferences on Artificial Intelligence, 2021, pp. 2140–2147

  35. [35]

    C. Anderson, The Long Tail: Why the Future of Business is Selling Less of More, Hyperion, New York, 2006

  36. [36]

    Y. Cui, M. Jia, T.-Y. Lin, Y. Song, S. Belongie, Class-balanced loss based on effective number of samples, in: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2019, pp. 9268–9277

  37. [37]

    L. Yang, H. Jiang, Q. Song, J. Guo, A survey on long-tailed visual recognition, International Journal of Computer Vision 130 (2022) 1837–1872

  38. [38]

    S. H. Rudy, T. P. Sapsis, Output-weighted and relative entropy loss functions for deep learning precursors of extreme events, Physica D: Nonlinear Phenomena 443 (2023) 133570.

  39. [39]

    C. Song, R. Xiao, C. Zhang, X. Zhao, B. Sun, Simulation-free reliability analysis with importance sampling-based adaptive training physics-informed neural networks: Method and application to chloride penetration, Reliability Engineering & System Safety 246 (2024) 110083

  40. [40]

    T. Chen, H. Chen, Universal approximation to nonlinear operators by neural networks with arbitrary activation functions and its application to dynamical systems, IEEE transactions on neural networks 6 (1995) 911–917

  41. [41]

    N. Kovachki, Z. Li, B. Liu, K. Azizzadenesheli, K. Bhattacharya, A. Stuart, A. Anandkumar, Neural operator: Learning maps between function spaces with applications to pdes, Journal of Machine Learning Research 24 (2023) 1–97

  42. [42]

    M. Steininger, K. Kobs, P. Davidson, A. Krause, A. Hotho, Density-based weighting for imbalanced regression, Machine Learning 110 (2021) 2187–2211

  43. [43]

    S. Wang, H. Wang, P. Perdikaris, Learning the solution operator of parametric partial differential equations with physics-informed DeepONets, Science advances 7 (2021) eabi8605

  44. [44]

    D. P. Kingma, J. Ba, Adam: A method for stochastic optimization, arXiv preprint arXiv:1412.6980 (2014)

  45. [45]

    M. Abadi, A. Agarwal, P. Barham, E. Brevdo, Z. Chen, C. Citro, G. S. Corrado, A. Davis, J. Dean, M. Devin, et al., Tensorflow: Large-scale machine learning on heterogeneous distributed systems, arXiv preprint arXiv:1603.04467 (2016)

  46. [46]

    F. P. Preparata, M. I. Shamos, Computational Geometry: An Introduction, Springer Science & Business Media, 2012

  47. [47]

    N. Wade, K. Teferra, https://github.com/USNavalResearchLaboratory/UQPINN, 2026.