
arxiv: 2606.21236 · v1 · pith:Z45RQM2Inew · submitted 2026-06-19 · ⚛️ physics.comp-ph · cs.NA · math.NA · math.OC · physics.flu-dyn

Physics-Informed Neural Networks for coupled stiff transport systems

Pith reviewed 2026-06-26 12:49 UTC · model grok-4.3

classification ⚛️ physics.comp-ph · cs.NA · math.NA · math.OC · physics.flu-dyn
keywords physics-informed neural networks, stiff transport equations, Marshak wave, radiative transport, conservation laws, activation functions, logarithmic loss

The pith

Three modifications to standard Physics-Informed Neural Networks recover accurate solutions for stiff coupled transport equations such as the Marshak wave.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that standard PINNs fail on stiff transport problems because of instability, loss imbalance across many orders of magnitude, and violation of conservation. It introduces a ScaledSigmoid activation to keep solutions positive and bounded, replaces the usual squared loss with a logarithmic version for initial and boundary data, and adds an explicit global conservation penalty derived from the governing equations. Monte Carlo sampling with exponential time weighting is used for training. The resulting network reproduces the hot region, cold region, and moving wavefront of the Marshak wave in agreement with an Implicit Monte Carlo reference, with run times under 30 minutes. The ablation experiments reported in the abstract (a linear final activation, no logarithmic loss, or removal of the PDE term) each produce qualitative failure.
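To make the bounded-activation idea concrete, here is a minimal, framework-free sketch. It assumes the ScaledSigmoid has the form lower + (upper - lower) * sigmoid(z); the paper's exact definition is not given in this review, and the names `scaled_sigmoid`, `lower`, and `upper` are hypothetical.

```python
import math


def scaled_sigmoid(z: float, lower: float, upper: float) -> float:
    """Map any real pre-activation z into the interval [lower, upper].

    Assumed form (not confirmed by the source):
        lower + (upper - lower) * sigmoid(z)
    With lower >= 0 the output can never become negative.
    """
    # Numerically stable logistic function: never call exp on a large
    # positive argument, which would raise OverflowError.
    if z >= 0.0:
        s = 1.0 / (1.0 + math.exp(-z))
    else:
        e = math.exp(z)
        s = e / (1.0 + e)
    return lower + (upper - lower) * s
```

Used as the final layer, a map of this kind guarantees positivity and boundedness by construction, so the optimizer does not have to learn those constraints through a penalty.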

Core claim

The combination of a ScaledSigmoid final activation, a logarithmic MSE loss on initial and boundary conditions, and explicit enforcement of global conservation laws allows Physics-Informed Neural Networks to solve the Marshak wave equations while respecting physical bounds and matching reference Implicit Monte Carlo solutions across all regimes.
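The motivation for a logarithmic loss can be illustrated with a small sketch. It assumes the loss is the mean squared difference of logarithms, which requires strictly positive predictions and targets; the paper's exact formula may differ, and the function names are hypothetical.

```python
import math


def mse(pred, target):
    """Standard quadratic loss."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)


def log_mse(pred, target):
    """Assumed logarithmic MSE: mean of (log p - log t)^2.

    Both sequences must be strictly positive, which a bounded positive
    output activation would guarantee for the predictions.
    """
    return sum(
        (math.log(p) - math.log(t)) ** 2 for p, t in zip(pred, target)
    ) / len(target)


# Illustrative data: one "hot" value and one "cold" value 12 orders
# of magnitude smaller. The cold prediction is wrong by a factor 1e6.
target = [1.0, 1e-12]
pred = [1.0, 1e-6]
```

On this data the quadratic loss is about 5e-13, so the optimizer sees almost no signal from the cold value, while the logarithmic loss is about 95 because it measures the error in orders of magnitude.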

What carries the argument

The argument rests on three modifications: a ScaledSigmoid activation enforcing positivity and bounds, a logarithmic MSE loss for conditions spanning up to 12 orders of magnitude, and an additional loss term enforcing global conservation laws. These are combined with Monte Carlo point sampling and exponential time weighting.
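One possible reading of Monte Carlo sampling with exponential time weighting is sketched below. It assumes collocation times are drawn on [0, t_max] with density proportional to exp(-lam * t), so early times are sampled more often, and that space is sampled uniformly. The rate `lam`, the direction of the weighting, and the function name are assumptions, not details taken from the paper.

```python
import math
import random


def sample_collocation(n, t_max, x_min, x_max, lam, rng):
    """Draw n collocation points (t, x).

    t follows a truncated exponential law on [0, t_max], sampled by
    inverting its CDF; x is uniform on [x_min, x_max].
    """
    norm = 1.0 - math.exp(-lam * t_max)  # total mass of exp(-lam*t) on [0, t_max], times lam
    points = []
    for _ in range(n):
        u = rng.random()  # uniform in [0, 1)
        t = -math.log(1.0 - u * norm) / lam
        x = rng.uniform(x_min, x_max)
        points.append((t, x))
    return points


# Example usage with a fixed seed for reproducibility.
rng = random.Random(0)
pts = sample_collocation(2000, 1.0, 0.0, 1.0, 3.0, rng)
```

Resampling the points at every iteration, rather than fixing a grid, is what makes the scheme a Monte Carlo estimate of the loss integrals.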

If this is right

  • Stiff hyperbolic systems with nonlinear coupling become solvable by the modified PINN framework.
  • The approach extends to other radiative transport problems in engineering.
  • Each of the three ingredients must be retained; omitting any one produces failure.
  • Training completes in practical wall-clock times under 30 minutes for the benchmark problem.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same bounded activation and logarithmic loss combination could be tested on other PDE systems that exhibit extreme scale separation in initial or boundary data.
  • Conservation enforcement might be generalized to additional integral invariants beyond global mass or energy.
  • The framework could be applied to time-dependent problems whose stiffness changes during evolution.

Load-bearing premise

The three modifications are each necessary and jointly sufficient to overcome the identified failure modes of standard PINNs on stiff transport systems.

What would settle it

Running the modified network on the Marshak wave benchmark and observing that the computed solution deviates qualitatively from the Implicit Monte Carlo reference in the hot region, cold region, or wavefront location would falsify the central claim.

Figures

Figures reproduced from arXiv: 2606.21236 by Gabriel Turinici, Laetitia Laguzet.

Figure 1: Illustration of a neural network generating the solution as a function (to be learned) of the inputs. The advantage is that standard "deep learning" libraries can easily compute the derivatives of the output with respect to the inputs and also with respect to the network parameters.
Figure 2: Actual architecture of the neural network. … both u and T by a specific part. As is classic in PINNs, an embedding is also added, here a polynomial embedding of order 2, which means that the input (t, x) is transformed into the vector (t, x, t², tx, x²); combined with a fully connected architecture this gives the layout …
Figure 3: Reference solution for the problem (11)-(17) obtained by the Im…
Figure 4: Execution result after niter = 10k iterations using the Adam optimizer (learning rate and other parameters are PyTorch defaults). Left: the Marshak profiles at different time instances. Right: the complete T̃ evolution, time is the ordinate. We note that the (t, x) space is clearly separated into three regions: 'hot' (yellow), 'wave' (red) and 'cold' (blue).
Figure 5: Convergence evolution for the result in Figure 4; we plot the profiles …
Figure 6: Initial conditions and left boundary conditions for the result in …
Figure 7: Same as Figure 4 except that in each case we take out a single …
Figure 8: Left boundary conditions when replacing the MSE log loss with …
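The order-2 polynomial embedding mentioned in the Figure 2 text maps the input (t, x) to (t, x, t², tx, x²). A minimal sketch follows; the function name is hypothetical, and in practice the same map would be applied to a whole batch of points before the first fully connected layer.

```python
from typing import List


def poly_embed_order2(t: float, x: float) -> List[float]:
    """Return all monomials of degree 1 and 2 in (t, x).

    Order of features: t, x, t^2, t*x, x^2.
    """
    return [t, x, t * t, t * x, x * x]
```

The embedding raises the input dimension from 2 to 5, so the first dense layer receives quadratic features directly instead of having to synthesize them.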
original abstract

Purpose: Physics-Informed Neural Networks (PINNs) struggle with stiff, regime-changing transport equations due to instability, loss imbalance, and violations of physical consistency. This paper investigates these failures through the Marshak wave equations, a canonical benchmark from radiative transport where initial and boundary conditions differ by up to 12 orders of magnitude, and proposes targeted modifications to the standard PINN framework to overcome them.

Design/methodology/approach: Three modifications are introduced: (1) a ScaledSigmoid final activation enforcing physical bounds and positivity of the unknowns; (2) a logarithmic MSE loss replacing the standard quadratic loss for initial and boundary conditions, enabling training across extreme scale disparities; and (3) explicit enforcement of global conservation laws derived from the governing equations as an additional physics loss term. Monte Carlo sampling with exponential time weighting is used throughout.

Findings: The proposed framework successfully recovers the Marshak wave dynamics (including the hot, cold, and wave-front regions) in agreement with a reference Implicit Monte Carlo solution, with run times under 30 minutes. Ablation studies confirm that each ingredient is essential: linear activation, absence of the logarithmic loss, or removal of the PDE term each independently cause the method to fail qualitatively.

Originality/value: This work identifies and resolves three concrete failure modes of standard PINNs on stiff hyperbolic systems with nonlinear coupling. The combination of bounded activations, scale-aware loss functions, and conservation law enforcement constitutes a novel and practically validated framework, with applicability to radiative transport and other coupled stiff PDE systems in engineering.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript claims that standard PINNs fail on stiff coupled transport systems such as the Marshak wave equations due to instability, loss imbalance across 12 orders of magnitude, and violations of physical consistency. It introduces three targeted modifications—ScaledSigmoid activation to enforce bounds and positivity, logarithmic MSE loss for initial/boundary conditions, and explicit global conservation enforcement as an additional physics term—combined with Monte Carlo sampling and exponential time weighting. The framework is reported to recover the hot, cold, and wave-front regions in qualitative agreement with an Implicit Monte Carlo reference, with run times under 30 minutes, and ablation studies are said to confirm that each modification is essential.

Significance. If the central claims hold with quantitative support, the work would provide a concrete, practically useful set of fixes for applying PINNs to stiff hyperbolic systems with nonlinear coupling in radiative transport. The agreement with an independent reference solution and the explicit identification of three failure modes constitute strengths. The absence of quantitative error metrics and ablation details, however, leaves the necessity and sufficiency claims difficult to evaluate at present.

major comments (2)
  1. [Findings] Findings section: the ablation studies report only that removing any one of the three modifications (ScaledSigmoid, logarithmic loss, or conservation term) causes qualitative failure, but supply no quantitative error metrics (L2, relative pointwise, or region-specific), no information on relative loss weightings among the logarithmic IC/BC term, PDE residual, and conservation term, and no confirmation that optimizer schedules or Monte-Carlo sampling densities were held fixed across ablations. This makes it impossible to distinguish genuine necessity from hyper-parameter sensitivity.
  2. [Abstract / Findings] Abstract and Findings: the reported success is described only in qualitative terms ("recovers the Marshak wave dynamics... in agreement with a reference Implicit Monte Carlo solution"); no numerical error measures, network architecture details, collocation-point counts, or verification that the conservation term is derived without approximation error are provided, weakening the ability to assess the framework's accuracy and reproducibility.
minor comments (1)
  1. The manuscript would benefit from a table or figure panel that reports quantitative discrepancies (e.g., pointwise or integrated errors) between the PINN solution and the IMC reference in the hot, cold, and front regions.
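The quantitative metrics requested in these comments could be computed along the following lines. This is a sketch only: the region labels and the sample values are made up for illustration, and how a point is assigned to the hot, cold, or front region is an assumption left to the user.

```python
import math


def relative_l2(pred, ref):
    """Relative L2 error: ||pred - ref||_2 / ||ref||_2."""
    num = sum((p - r) ** 2 for p, r in zip(pred, ref))
    den = sum(r ** 2 for r in ref)
    return math.sqrt(num / den)


def max_relative(pred, ref):
    """Largest pointwise relative error; ref must be nonzero."""
    return max(abs(p - r) / abs(r) for p, r in zip(pred, ref))


def metrics_by_region(pred, ref, labels):
    """Compute (relative L2, max relative) separately per region label."""
    out = {}
    for name in sorted(set(labels)):
        idx = [i for i, lab in enumerate(labels) if lab == name]
        p = [pred[i] for i in idx]
        r = [ref[i] for i in idx]
        out[name] = (relative_l2(p, r), max_relative(p, r))
    return out


# Made-up sample: 10% error in the hot region, 100% error at one cold point.
ref = [1.0, 1.0, 1e-3, 1e-3]
pred = [1.1, 0.9, 2e-3, 1e-3]
labels = ["hot", "hot", "cold", "cold"]
per_region = metrics_by_region(pred, ref, labels)
```

On this sample the global relative L2 error is about 0.10, while the cold region alone has a maximum relative error of 1.0. That gap is why region-specific reporting matters when values span many orders of magnitude.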

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments on our manuscript. We agree that additional quantitative details will strengthen the presentation and address each major comment below. We will incorporate the requested information in a revised version.

point-by-point responses
  1. Referee: [Findings] Findings section: the ablation studies report only that removing any one of the three modifications (ScaledSigmoid, logarithmic loss, or conservation term) causes qualitative failure, but supply no quantitative error metrics (L2, relative pointwise, or region-specific), no information on relative loss weightings among the logarithmic IC/BC term, PDE residual, and conservation term, and no confirmation that optimizer schedules or Monte-Carlo sampling densities were held fixed across ablations. This makes it impossible to distinguish genuine necessity from hyper-parameter sensitivity.

    Authors: We agree that the current ablation results are presented only qualitatively and that this limits evaluation of necessity versus hyperparameter effects. In the revised manuscript we will add quantitative L2 and relative pointwise errors (both global and region-specific for hot, cold, and wavefront zones) for each ablation. We will also report the exact relative loss weightings among the logarithmic IC/BC term, PDE residual, and conservation term, and explicitly state that optimizer schedules and Monte-Carlo sampling densities were held fixed. These additions will appear in an expanded Findings section with a new table of metrics. revision: yes

  2. Referee: [Abstract / Findings] Abstract and Findings: the reported success is described only in qualitative terms ("recovers the Marshak wave dynamics... in agreement with a reference Implicit Monte Carlo solution"); no numerical error measures, network architecture details, collocation-point counts, or verification that the conservation term is derived without approximation error are provided, weakening the ability to assess the framework's accuracy and reproducibility.

    Authors: The manuscript currently emphasizes qualitative agreement because the primary demonstration is successful recovery of all three physical regimes in a stiff system where standard PINNs fail. To improve reproducibility we will add numerical error measures (maximum relative error and L2 norms versus the IMC reference) to both the Abstract and Findings. Network architecture (layers and neurons per layer), total collocation-point counts, and the derivation of the conservation term will be supplied; the term is obtained by exact spatial integration of the governing equations and therefore carries no additional approximation beyond the quadrature used for the loss. These details will be inserted in the revised text. revision: yes
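The idea of a conservation term obtained by spatial integration can be sketched on a generic scalar law u_t + f(u)_x = 0 on [x_min, x_max]. This is not the Marshak system. Integrating in x gives d/dt ∫u dx = f(u(t, x_min)) - f(u(t, x_max)), and the penalty is the mean squared mismatch of that balance. All names, the toy candidates, and the finite-difference and quadrature choices below are assumptions for illustration.

```python
import math


def trapezoid(values, dx):
    """Composite trapezoid rule on a uniform grid."""
    return dx * (sum(values) - 0.5 * (values[0] + values[-1]))


def conservation_residual(u, flux, t, x_min, x_max, n=400, dt=1e-4):
    """Return d/dt (integral of u over x) minus net boundary inflow at time t."""
    dx = (x_max - x_min) / n
    xs = [x_min + i * dx for i in range(n + 1)]

    def mass(s):
        return trapezoid([u(s, x) for x in xs], dx)

    # Central finite difference in time (a PINN would use autograd instead).
    dmass_dt = (mass(t + dt) - mass(t - dt)) / (2.0 * dt)
    net_inflow = flux(u(t, x_min)) - flux(u(t, x_max))
    return dmass_dt - net_inflow


def conservation_penalty(u, flux, times, x_min, x_max):
    """Mean squared conservation residual over the sampled times."""
    res = [conservation_residual(u, flux, t, x_min, x_max) for t in times]
    return sum(r * r for r in res) / len(res)


def linear_flux(value):
    """Toy flux f(u) = c*u with c = 1 (linear advection)."""
    return value


def advected_bump(t, x):
    """Exact advection solution: a Gaussian moving right at speed 1."""
    return math.exp(-((x - t - 0.3) ** 2) / 0.01)


def growing_bump(t, x):
    """Non-conservative candidate: a static Gaussian whose mass grows."""
    return (1.0 + t) * math.exp(-((x - 0.5) ** 2) / 0.01)


times = [0.2, 0.4]
good = conservation_penalty(advected_bump, linear_flux, times, 0.0, 1.0)
bad = conservation_penalty(growing_bump, linear_flux, times, 0.0, 1.0)
```

The exact solution yields a penalty near zero, limited only by quadrature and finite-difference error, whereas the mass-creating candidate is clearly penalized. This is consistent with the authors' remark that the only approximation comes from the quadrature used for the loss.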

Circularity Check

0 steps flagged

No circularity: results validated against independent external reference

full rationale

The paper introduces three modifications to standard PINNs (ScaledSigmoid activation, logarithmic MSE loss, and explicit global conservation enforcement) and reports that the resulting model recovers Marshak wave dynamics in agreement with a reference Implicit Monte Carlo solution. No derivation step reduces by construction to a fitted quantity defined inside the method; the central claims rest on comparison to an external benchmark rather than self-referential fitting or self-citation chains. Ablation studies are described only qualitatively, but this is a methodological reporting issue, not a circular reduction of the claimed result to its inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

Abstract-only review limits visibility into exact assumptions; the claim rests on the domain equations admitting easily computable global conservation laws and on the reference Monte Carlo solution being treated as ground truth.

axioms (1)
  • domain assumption: The Marshak wave equations admit global conservation laws that can be explicitly enforced as an additional loss term without introducing inconsistency with the PDE residual.
    Invoked when adding the conservation term to the physics loss.

pith-pipeline@v0.9.1-grok · 5823 in / 1380 out tokens · 33316 ms · 2026-06-26T12:49:23.675120+00:00 · methodology

