Physics-Informed Neural Networks for coupled stiff transport systems
Pith reviewed 2026-06-26 12:49 UTC · model grok-4.3
The pith
Three modifications to standard Physics-Informed Neural Networks recover accurate solutions for stiff coupled transport equations such as the Marshak wave.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The combination of a ScaledSigmoid final activation, a logarithmic MSE loss on initial and boundary conditions, and explicit enforcement of global conservation laws allows Physics-Informed Neural Networks to solve the Marshak wave equations while respecting physical bounds and matching reference Implicit Monte Carlo solutions across all regimes.
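To make the bounded-output idea concrete, here is a minimal sketch of what a ScaledSigmoid final activation might look like. The paper's exact definition is not reproduced here, so the form `lower + (upper - lower) * sigmoid(z)`, the function name `scaled_sigmoid`, and the bounds `T_MIN` and `T_MAX` are all assumptions for illustration.

```python
import math

def scaled_sigmoid(z, lower, upper):
    """Map an unbounded pre-activation z into the interval [lower, upper]."""
    # Numerically stable logistic function: avoids overflow in exp for large |z|.
    if z >= 0:
        s = 1.0 / (1.0 + math.exp(-z))
    else:
        e = math.exp(z)
        s = e / (1.0 + e)
    # Affine rescaling of the sigmoid output (assumed form, not taken from the paper).
    return lower + (upper - lower) * s

# Hypothetical bounds for a normalized, strictly positive unknown.
T_MIN, T_MAX = 1e-12, 1.0
outputs = [scaled_sigmoid(z, T_MIN, T_MAX) for z in (-50.0, -5.0, 0.0, 5.0, 50.0)]
print(outputs)
```

Whatever the raw network output, the value returned stays positive and within the prescribed bounds, which is the property the core claim relies on.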
What carries the argument
Three modifications carry the argument: a ScaledSigmoid activation that enforces positivity and bounds, a logarithmic MSE loss for conditions spanning up to 12 orders of magnitude, and an additional loss term that enforces global conservation laws. They are used together with Monte Carlo point sampling and exponential time weighting.
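The sampling and weighting step can be sketched as follows. This is one possible reading only: the weight `exp(-rate * t)`, the uniform sampling of `(x, t)`, the toy residual, and every name in the block are assumptions, since the review does not state the paper's actual weighting formula or its direction.

```python
import math
import random

def sample_collocation(n, x_max, t_max, rng):
    """Draw n collocation points (x, t) uniformly from [0, x_max] x [0, t_max]."""
    return [(rng.uniform(0.0, x_max), rng.uniform(0.0, t_max)) for _ in range(n)]

def time_weight(t, rate):
    """Exponential weight emphasizing early times (assumed form exp(-rate * t))."""
    return math.exp(-rate * t)

def weighted_residual_loss(points, residual, rate):
    """Monte Carlo estimate of the time-weighted mean squared residual."""
    weights = [time_weight(t, rate) for _, t in points]
    total = sum(w * residual(x, t) ** 2 for w, (x, t) in zip(weights, points))
    return total / sum(weights)

rng = random.Random(0)  # fixed seed so the sketch is reproducible
points = sample_collocation(1000, x_max=1.0, t_max=1.0, rng=rng)

# Toy residual that grows with time, standing in for a real PDE residual.
toy_residual = lambda x, t: t
loss_weighted = weighted_residual_loss(points, toy_residual, rate=5.0)
loss_uniform = weighted_residual_loss(points, toy_residual, rate=0.0)
print(loss_weighted, loss_uniform)
```

With a decaying weight, late-time residuals contribute less to the loss, so the optimizer is pushed to fit early times first; setting `rate=0.0` recovers the unweighted Monte Carlo mean.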
If this is right
- Stiff hyperbolic systems with nonlinear coupling become solvable by the modified PINN framework.
- The approach extends to other radiative transport problems in engineering.
- Each of the three ingredients must be retained; omitting any one produces failure.
- Training completes in practical wall-clock times under 30 minutes for the benchmark problem.
Where Pith is reading between the lines
- The same bounded activation and logarithmic loss combination could be tested on other PDE systems that exhibit extreme scale separation in initial or boundary data.
- Conservation enforcement might be generalized to additional integral invariants beyond global mass or energy.
- The framework could be applied to time-dependent problems whose stiffness changes during evolution.
Load-bearing premise
The three modifications are each necessary and jointly sufficient to overcome the identified failure modes of standard PINNs on stiff transport systems.
What would settle it
Running the modified network on the Marshak wave benchmark and observing that the computed solution deviates qualitatively from the Implicit Monte Carlo reference in the hot region, cold region, or wavefront location would falsify the central claim.
Original abstract
Purpose: Physics-Informed Neural Networks (PINNs) struggle with stiff, regime-changing transport equations due to instability, loss imbalance, and violations of physical consistency. This paper investigates these failures through the Marshak wave equations - a canonical benchmark from radiative transport - where initial and boundary conditions differ by up to 12 orders of magnitude, and proposes targeted modifications to the standard PINN framework to overcome them.
Design/methodology/approach: Three modifications are introduced: (1) a ScaledSigmoid final activation enforcing physical bounds and positivity of the unknowns; (2) a logarithmic MSE loss replacing the standard quadratic loss for initial and boundary conditions, enabling training across extreme scale disparities; and (3) explicit enforcement of global conservation laws derived from the governing equations as an additional physics loss term. Monte Carlo sampling with exponential time weighting is used throughout.
Findings: The proposed framework successfully recovers the Marshak wave dynamics - including the hot, cold, and wave-front regions - in agreement with a reference Implicit Monte Carlo solution, with run times under 30 minutes. Ablation studies confirm that each ingredient is essential: linear activation, absence of the logarithmic loss, or removal of the PDE term each independently cause the method to fail qualitatively.
Originality/value: This work identifies and resolves three concrete failure modes of standard PINNs on stiff hyperbolic systems with nonlinear coupling. The combination of bounded activations, scale-aware loss functions, and conservation law enforcement constitutes a novel and practically validated framework, with applicability to radiative transport and other coupled stiff PDE systems in engineering.
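A small numerical sketch shows why a logarithmic loss helps when target values differ by 12 orders of magnitude. The base-10 logarithm, the `eps` guard, and the function names are assumptions; the abstract does not specify the exact form of the logarithmic MSE.

```python
import math

def mse(pred, target):
    """Standard mean squared error."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

def log_mse(pred, target, eps=1e-30):
    """Mean squared error between log10 values (assumes positive inputs).

    eps is a hypothetical guard against log(0).
    """
    return sum(
        (math.log10(p + eps) - math.log10(t + eps)) ** 2
        for p, t in zip(pred, target)
    ) / len(pred)

# Hot value of order 1 and cold value of order 1e-12, as in the stated scale gap.
target = [1.0, 1e-12]
# Prediction: hot value exact, cold value wrong by six orders of magnitude.
pred = [1.0, 1e-6]

plain = mse(pred, target)       # about 5e-13: the cold-side error is invisible
logged = log_mse(pred, target)  # about 18: the same error dominates the loss
print(plain, logged)
```

The quadratic loss barely registers an error of six decades on the cold side, while the logarithmic loss penalizes it in proportion to the number of decades missed.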
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript claims that standard PINNs fail on stiff coupled transport systems such as the Marshak wave equations due to instability, loss imbalance across 12 orders of magnitude, and violations of physical consistency. It introduces three targeted modifications—ScaledSigmoid activation to enforce bounds and positivity, logarithmic MSE loss for initial/boundary conditions, and explicit global conservation enforcement as an additional physics term—combined with Monte Carlo sampling and exponential time weighting. The framework is reported to recover the hot, cold, and wave-front regions in qualitative agreement with an Implicit Monte Carlo reference, with run times under 30 minutes, and ablation studies are said to confirm that each modification is essential.
Significance. If the central claims hold with quantitative support, the work would provide a concrete, practically useful set of fixes for applying PINNs to stiff hyperbolic systems with nonlinear coupling in radiative transport. The agreement with an independent reference solution and the explicit identification of three failure modes constitute strengths. The absence of quantitative error metrics and ablation details, however, leaves the necessity and sufficiency claims difficult to evaluate at present.
major comments (2)
- [Findings] Findings section: the ablation studies report only that removing any one of the three modifications (ScaledSigmoid, logarithmic loss, or conservation term) causes qualitative failure, but supply no quantitative error metrics (L2, relative pointwise, or region-specific), no information on relative loss weightings among the logarithmic IC/BC term, PDE residual, and conservation term, and no confirmation that optimizer schedules or Monte-Carlo sampling densities were held fixed across ablations. This makes it impossible to distinguish genuine necessity from hyper-parameter sensitivity.
- [Abstract / Findings] Abstract and Findings: the reported success is described only in qualitative terms ("recovers the Marshak wave dynamics... in agreement with a reference Implicit Monte Carlo solution"); no numerical error measures, network architecture details, collocation-point counts, or verification that the conservation term is derived without approximation error are provided, weakening the ability to assess the framework's accuracy and reproducibility.
minor comments (1)
- The manuscript would benefit from a table or figure panel that reports quantitative discrepancies (e.g., pointwise or integrated errors) between the PINN solution and the IMC reference in the hot, cold, and front regions.
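The kind of region-specific metric the referee asks for could be computed along these lines. The grid, the region boundaries, and the toy `pred` and `ref` values are invented for illustration and are not results from the paper.

```python
import math

def relative_l2(pred, ref):
    """Relative L2 error ||pred - ref|| / ||ref||."""
    num = math.sqrt(sum((p - r) ** 2 for p, r in zip(pred, ref)))
    den = math.sqrt(sum(r ** 2 for r in ref))
    return num / den

def region_errors(x, pred, ref, regions):
    """Relative L2 error restricted to each half-open interval [lo, hi)."""
    out = {}
    for name, (lo, hi) in regions.items():
        idx = [i for i, xi in enumerate(x) if lo <= xi < hi]
        out[name] = relative_l2([pred[i] for i in idx], [ref[i] for i in idx])
    return out

# Toy profile: hot plateau, wavefront, cold region (illustrative values only).
x = [i / 10 for i in range(10)]
ref = [1.0] * 4 + [0.5] * 2 + [1e-6] * 4
pred = [1.0] * 4 + [0.55] * 2 + [2e-6] * 4
regions = {"hot": (0.0, 0.4), "front": (0.4, 0.6), "cold": (0.6, 1.0)}

global_err = relative_l2(pred, ref)
errs = region_errors(x, pred, ref, regions)
print(global_err, errs)
```

In this toy case the global relative L2 error is about 3% even though the cold region is off by 100%, which is why a per-region table is more informative than a single global number.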
Simulated Author's Rebuttal
We thank the referee for the constructive comments on our manuscript. We agree that additional quantitative details will strengthen the presentation and address each major comment below. We will incorporate the requested information in a revised version.
Point-by-point responses
- Referee: [Findings] Findings section: the ablation studies report only that removing any one of the three modifications (ScaledSigmoid, logarithmic loss, or conservation term) causes qualitative failure, but supply no quantitative error metrics (L2, relative pointwise, or region-specific), no information on relative loss weightings among the logarithmic IC/BC term, PDE residual, and conservation term, and no confirmation that optimizer schedules or Monte-Carlo sampling densities were held fixed across ablations. This makes it impossible to distinguish genuine necessity from hyper-parameter sensitivity.
  Authors: We agree that the current ablation results are presented only qualitatively and that this limits evaluation of necessity versus hyperparameter effects. In the revised manuscript we will add quantitative L2 and relative pointwise errors (both global and region-specific for hot, cold, and wavefront zones) for each ablation. We will also report the exact relative loss weightings among the logarithmic IC/BC term, PDE residual, and conservation term, and explicitly state that optimizer schedules and Monte-Carlo sampling densities were held fixed. These additions will appear in an expanded Findings section with a new table of metrics.
  revision: yes
- Referee: [Abstract / Findings] Abstract and Findings: the reported success is described only in qualitative terms ("recovers the Marshak wave dynamics... in agreement with a reference Implicit Monte Carlo solution"); no numerical error measures, network architecture details, collocation-point counts, or verification that the conservation term is derived without approximation error are provided, weakening the ability to assess the framework's accuracy and reproducibility.
  Authors: The manuscript currently emphasizes qualitative agreement because the primary demonstration is successful recovery of all three physical regimes in a stiff system where standard PINNs fail. To improve reproducibility we will add numerical error measures (maximum relative error and L2 norms versus the IMC reference) to both the Abstract and Findings. Network architecture (layers and neurons per layer), total collocation-point counts, and the derivation of the conservation term will be supplied; the term is obtained by exact spatial integration of the governing equations and therefore carries no additional approximation beyond the quadrature used for the loss. These details will be inserted in the revised text.
  revision: yes
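The statement that the conservation term follows from exact spatial integration can be illustrated on a generic one-dimensional balance law. The sketch below is not the Marshak system: the symbols u, F, S, the interval [a, b], the network output u_theta, and the sample times t_k are placeholders chosen for illustration.

```latex
% Generic 1D balance law on x in [a, b]; u, F, S are placeholder symbols.
\begin{align}
  \partial_t u + \partial_x F(u) &= S(u), \\
  % Integrate over [a, b]; the flux term reduces to boundary values.
  \frac{d}{dt}\int_a^b u(x,t)\,dx
    &= F\big(u(a,t)\big) - F\big(u(b,t)\big) + \int_a^b S\big(u(x,t)\big)\,dx, \\
  % Residual of the integrated law evaluated on the network output u_theta.
  R_\theta(t) &= \frac{d}{dt}\int_a^b u_\theta(x,t)\,dx
    - F\big(u_\theta(a,t)\big) + F\big(u_\theta(b,t)\big)
    - \int_a^b S\big(u_\theta(x,t)\big)\,dx, \\
  % Conservation loss: mean squared residual over N_t sampled times.
  \mathcal{L}_{\mathrm{cons}} &= \frac{1}{N_t}\sum_{k=1}^{N_t} R_\theta(t_k)^2 .
\end{align}
```

The second line is exact for any smooth solution, so the only approximation entering the loss is the quadrature used for the spatial integrals, which matches the authors' remark.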
Circularity Check
No circularity: results validated against independent external reference
The paper introduces three modifications to standard PINNs (ScaledSigmoid activation, logarithmic MSE loss, and explicit global conservation enforcement) and reports that the resulting model recovers Marshak wave dynamics in agreement with a reference Implicit Monte Carlo solution. No derivation step reduces by construction to a fitted quantity defined inside the method; the central claims rest on comparison to an external benchmark rather than self-referential fitting or self-citation chains. Ablation studies are described only qualitatively, but this is a methodological reporting issue, not a circular reduction of the claimed result to its inputs.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: The Marshak wave equations admit global conservation laws that can be explicitly enforced as an additional loss term without introducing inconsistency with the PDE residual.