A neural network method for scalar conservation laws with convergence rates for shock-wave solutions
Pith reviewed 2026-05-21 01:08 UTC · model grok-4.3
The pith
A neural network method for scalar conservation laws achieves explicit L1 convergence rates for solutions with shocks.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
For piecewise smooth entropy solutions of scalar conservation laws that include shocks, rarefactions, compound waves, regular shock interactions, and (in one space dimension) nondegenerate shock formation, the paper constructs explicit tanh neural networks with provably small loss, and minimizers of the proposed loss satisfy explicit L1 convergence rates; when network size grows in proportion to the number of degrees of freedom of a space-time mesh of size h, the analysis recovers the classical Kuznetsov rate O(h^{1/2}) in shock-dominated cases.
What carries the argument
A computable approximation of the Kružkov entropy residual that sits between the strong and weak forms of the entropy inequality, paired with explicit constructions of tanh neural networks that approximate shock-adapted continuous piecewise linear functions.
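For orientation, the entropy inequality referred to here can be written in its standard weak (Kružkov) form for a scalar law u_t + div f(u) = 0. This is the textbook statement, not the paper's computable residual, whose precise definition is not reproduced on this page; the symbols u, f, k, and φ are notational assumptions made for this sketch.

```latex
% Kružkov entropy inequality (standard weak form): for every constant k and
% every nonnegative test function \varphi \in C_c^\infty(\mathbb{R}^d \times (0,T)),
\int_0^T \!\! \int_{\mathbb{R}^d}
  \Big( |u - k|\,\partial_t \varphi
      + \operatorname{sgn}(u - k)\,\big(f(u) - f(k)\big) \cdot \nabla_x \varphi \Big)
  \,dx\,dt \;\ge\; 0 .
```

The strong form asks for the pointwise inequality \partial_t |u - k| + \nabla_x \cdot [\operatorname{sgn}(u - k)(f(u) - f(k))] \le 0 wherever u is smooth; the residual described above is said to sit between these two formulations.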
If this is right
- Minimizers of the entropy residual loss satisfy rigorous L1 error bounds for solutions that contain shocks and other discontinuities.
- The method recovers the classical Kuznetsov rate O(h^{1/2}) in shock-dominated cases whenever network size scales with the number of degrees of freedom of a space-time mesh of width h.
- The construction applies to one and two space dimensions and covers rarefactions, compound waves, regular shock interactions, and, in one space dimension, nondegenerate shock formation from smooth initial data.
- Numerical experiments in one and two dimensions support the derived rates and indicate that observed accuracy can exceed the guaranteed rate.
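The mesh-based scaling in the second bullet can be restated in terms of the number of network parameters. The following is a back-of-the-envelope sketch, assuming a quasi-uniform space-time mesh in d space dimensions so that the number of degrees of freedom is N ~ h^{-(d+1)}; the symbols N, d, and u_\theta are introduced here for illustration, and the resulting exponents are not quoted from the paper.

```latex
N \sim h^{-(d+1)}
\quad\Longrightarrow\quad
h \sim N^{-1/(d+1)}
\quad\Longrightarrow\quad
\|u - u_\theta\|_{L^1} = O\big(h^{1/2}\big) = O\big(N^{-1/(2(d+1))}\big).
% Under this assumption: d = 1 gives O(N^{-1/4}), d = 2 gives O(N^{-1/6}).
```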
Where Pith is reading between the lines
- Similar entropy-residual losses could be tested on systems of conservation laws where piecewise-smooth solutions are still available.
- The explicit network constructions suggest a route to making physics-informed neural networks fully rigorous for other hyperbolic problems by importing classical approximation theory.
- One could check whether the same scaling of network size with mesh degrees of freedom produces comparable rates for problems whose shocks interact in ways excluded from the current analysis.
Load-bearing premise
The entropy solution is piecewise smooth so that shock-adapted continuous piecewise linear functions exist and tanh neural networks can approximate them at known rates.
What would settle it
A concrete piecewise smooth entropy solution containing a shock for which the L1 error of a trained network minimizer fails to decay like O(h^{1/2}) when network size is increased in proportion to the degrees of freedom of a mesh of width h.
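A test of this kind usually comes down to estimating an empirical convergence order from errors measured at several mesh widths. The sketch below shows that standard calculation, assuming an error model of the form C * h**p; the function name `observed_orders` is hypothetical, and the data are synthetic placeholders built to follow h^{1/2} exactly, not results from the paper.

```python
import math

def observed_orders(hs, errors):
    """Empirical order p between consecutive (h, error) pairs,
    assuming error ~ C * h**p on each pair."""
    orders = []
    for i in range(len(hs) - 1):
        ratio_err = errors[i] / errors[i + 1]
        ratio_h = hs[i] / hs[i + 1]
        orders.append(math.log(ratio_err) / math.log(ratio_h))
    return orders

# Synthetic placeholder data following 0.3 * h**0.5 exactly (NOT measured results).
hs = [0.1 / 2**k for k in range(5)]
errors = [0.3 * math.sqrt(h) for h in hs]

orders = observed_orders(hs, errors)
print(orders)
```

With real training runs, orders that stay clearly below 0.5 as h decreases (with network size scaled accordingly) would be the kind of evidence described above; optimization error would have to be ruled out separately, since the stated bound concerns loss minimizers.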
Original abstract
We propose a new entropy-compatible neural network method for scalar hyperbolic conservation laws and establish, to our knowledge, the first explicit \(L^1\) convergence rates in this setting that apply to piecewise smooth entropy solutions, including those with discontinuities. The method is based on a computable approximation of the Kružkov entropy residual that sits between the strong and weak forms of the entropy inequality. For piecewise smooth entropy solutions containing shocks, rarefactions, compound waves, regular shock interactions, and, in one space dimension, nondegenerate shock formation from smooth initial data, we construct explicit neural networks with provably small loss by combining shock-adapted continuous piecewise linear functions with known approximation properties of \(\tanh\) neural networks. Together with entropy-based stability estimates, this gives rigorous \(L^1\) error bounds for minimizers of the proposed loss. In particular, when the network size grows in proportion to the number of degrees of freedom of a space-time mesh of size \(h\), the analysis recovers the classical Kuznetsov rate \(O(h^{1/2})\) in shock-dominated cases. Numerical experiments in one and two space dimensions support the theory and suggest that the actual accuracy of the method can be better than the rate guaranteed by the analysis.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes an entropy-compatible neural network method for scalar hyperbolic conservation laws based on a computable approximation of the Kružkov entropy residual. For piecewise smooth entropy solutions (including shocks, rarefactions, compound waves, and shock formation), it constructs explicit tanh neural networks with provably small loss by combining shock-adapted continuous piecewise linear functions with known tanh approximation properties. Entropy stability estimates then yield rigorous L¹ error bounds for the loss minimizers; when network size scales with the degrees of freedom of an h-mesh, the classical Kuznetsov rate O(h^{1/2}) is recovered in shock-dominated regimes. Numerical experiments in 1D and 2D are provided to support the theory.
Significance. If the central claims hold, the work is significant for providing the first explicit L¹ convergence rates for neural-network approximations of scalar conservation laws that cover discontinuous entropy solutions. The explicit network construction, combined with standard entropy estimates to recover the Kuznetsov rate without fitted parameters, supplies a concrete theoretical bridge between scientific machine learning and classical numerical analysis for hyperbolic PDEs. This is a clear strength for a field where rigorous rates for discontinuous solutions have been scarce.
major comments (1)
- [explicit neural networks for piecewise smooth solutions] Section on explicit neural networks for piecewise smooth solutions: the claim that tanh networks achieve an O(h) entropy residual for shock-adapted piecewise-linear approximants requires the approximation constants to be independent of shock strength and speed; the current sketch leaves open whether these constants remain uniform when the number of shocks grows with 1/h, which would affect the parameter scaling needed for the O(h^{1/2}) rate.
minor comments (3)
- [Abstract] The abstract states that the method recovers the Kuznetsov rate 'in shock-dominated cases'; a brief remark clarifying what 'shock-dominated' means quantitatively (e.g., total variation or number of shocks relative to mesh size) would help readers.
- Notation for the entropy residual (between strong and weak forms) is introduced without an equation number in the main text; adding a displayed equation would improve readability when the residual is later bounded.
- [numerical experiments] In the numerical experiments section, the reported L¹ errors for the 2D examples should include a direct comparison table against a standard finite-volume scheme on the same meshes to make the practical advantage clearer.
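As an illustration of the kind of finite-volume baseline the last minor comment asks for, here is a minimal sketch of a first-order Godunov scheme for the 1D Burgers equation with a single shock, reporting a discrete L¹ error against the exact solution. Everything in it (the Burgers flux, the Riemann data, the domain [-1, 1], the CFL number 0.5, and the function names) is an assumption chosen for the example; it is not taken from the paper's experiments.

```python
def godunov_flux(a, b):
    """Exact Godunov flux for Burgers' flux f(u) = u**2 / 2
    with left state a and right state b."""
    return max(max(a, 0.0) ** 2, min(b, 0.0) ** 2) / 2.0

def burgers_shock_l1_error(n_cells, t_end=0.5, u_left=1.0, u_right=0.0):
    """Solve the Riemann problem on [-1, 1] with n_cells cells and return
    the discrete L1 error at t_end, measured at cell centres.
    Assumes u_left > u_right (a shock) and nonzero data."""
    dx = 2.0 / n_cells
    centers = [-1.0 + (i + 0.5) * dx for i in range(n_cells)]
    u = [u_left if x < 0.0 else u_right for x in centers]
    max_speed = max(abs(u_left), abs(u_right))
    t = 0.0
    while t < t_end - 1e-12:
        dt = min(0.5 * dx / max_speed, t_end - t)  # CFL number 0.5
        padded = [u[0]] + u + [u[-1]]              # constant extrapolation
        flux = [godunov_flux(padded[i], padded[i + 1])
                for i in range(n_cells + 1)]
        u = [u[i] - dt / dx * (flux[i + 1] - flux[i])
             for i in range(n_cells)]
        t += dt
    # Rankine-Hugoniot shock speed for Burgers: (u_left + u_right) / 2.
    shock_pos = 0.5 * (u_left + u_right) * t_end
    exact = [u_left if x < shock_pos else u_right for x in centers]
    return dx * sum(abs(a - b) for a, b in zip(u, exact))

for n in (40, 80, 160, 320):
    print(n, burgers_shock_l1_error(n))
```

A comparison table of the kind requested would list such errors next to the network errors at matching resolutions; which scheme and which meshes count as a fair comparison is a choice for the authors.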
Simulated Author's Rebuttal
We thank the referee for the careful reading, the positive overall assessment, and the recommendation for minor revision. The single major comment is addressed point by point below.
- Referee: [explicit neural networks for piecewise smooth solutions] Section on explicit neural networks for piecewise smooth solutions: the claim that tanh networks achieve an O(h) entropy residual for shock-adapted piecewise-linear approximants requires the approximation constants to be independent of shock strength and speed; the current sketch leaves open whether these constants remain uniform when the number of shocks grows with 1/h, which would affect the parameter scaling needed for the O(h^{1/2}) rate.
  Authors: We thank the referee for this precise observation. In the construction, each shock is approximated by a continuous piecewise-linear transition layer of width proportional to h whose local Lipschitz constant is bounded by the jump size divided by h. Because the total variation of entropy solutions is uniformly bounded, the sum of these local Lipschitz constants over all shocks remains O(1/h). Standard quantitative approximation results for tanh networks then show that the number of neurons required to achieve an O(h) entropy-residual contribution per transition scales linearly with the local Lipschitz constant; consequently the aggregate network size remains proportional to the number of degrees of freedom of an h-mesh. The approximation constants themselves depend only on the uniform bounds furnished by the Kružkov theory (maximum wave speed and total variation) and are therefore independent of individual shock strengths and speeds. We will insert a short clarifying paragraph together with a reference to the relevant quantitative tanh-approximation lemma in the revised manuscript to make this uniformity explicit.
  Revision: yes
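The two quantities used in this response, the L¹ cost of replacing a shock by a linear transition layer of width h and the layer's Lipschitz constant, can be checked directly. The sketch below assumes a single shock at x = 0 with a symmetric linear ramp; the names `ramp`, `ramp_l1_error`, and the example jump sizes are hypothetical, and the layer is an illustration rather than the paper's exact shock-adapted construction.

```python
def ramp(x, u_left, u_right, h):
    """Continuous piecewise-linear transition layer of width h centred at 0."""
    if x <= -h / 2:
        return u_left
    if x >= h / 2:
        return u_right
    return u_left + (u_right - u_left) * (x + h / 2) / h

def ramp_l1_error(u_left, u_right, h, n=2000):
    """Midpoint-rule L1 distance between the ramp and the sharp step.
    n is even, so x = 0 is a cell edge and the rule is exact on each half."""
    dx = h / n
    total = 0.0
    for i in range(n):
        x = -h / 2 + (i + 0.5) * dx
        step = u_left if x < 0.0 else u_right
        total += abs(ramp(x, u_left, u_right, h) - step) * dx
    return total

h = 0.01
jumps = [0.5, 0.3, 0.2]                       # hypothetical jump sizes
l1_errors = [ramp_l1_error(j, 0.0, h) for j in jumps]
lipschitz = [j / h for j in jumps]            # slope of each layer

# Closed form for one layer: |jump| * h / 4, so the L1 cost is O(h).
print(l1_errors, [j * h / 4 for j in jumps])
# The slopes sum to (sum of jumps) / h, bounded by (total variation) / h.
print(sum(lipschitz), sum(jumps) / h)
```

The second printout is the O(1/h) bound on summed Lipschitz constants invoked in the response; whether neuron counts scale linearly with those constants depends on the tanh-approximation lemma the authors cite, which is not reproduced here.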
Circularity Check
No significant circularity detected
The derivation proceeds from an explicit construction of shock-adapted continuous piecewise-linear approximants for piecewise-smooth entropy solutions, combined with established approximation rates for tanh networks and standard Kružkov entropy residual estimates, to obtain L1 error bounds that recover the classical Kuznetsov O(h^{1/2}) rate when network size scales with mesh degrees of freedom. No step reduces by definition to its own output, renames a fitted quantity as a prediction, or relies on a load-bearing self-citation whose validity is internal to the paper; all load-bearing ingredients are drawn from external approximation theory and classical PDE stability results.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption Kružkov entropy inequality holds for the scalar conservation law
- standard math tanh neural networks approximate continuous piecewise linear functions with known rates
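To give a feel for the second axiom, the sketch below measures how well a single tanh unit with output bias reproduces a sharp step. This is only an illustration under stated assumptions: it is not the paper's construction (which approximates shock-adapted continuous piecewise linear functions), and the names `tanh_layer`, `tanh_l1_error`, and the width parameter `delta` are hypothetical. The closed form used for comparison follows from the integral of 1 - tanh(t) over [0, ∞), which equals ln 2.

```python
import math

def tanh_layer(x, u_left, u_right, delta):
    """One tanh unit plus a bias: a smooth layer of width ~delta at x = 0."""
    mid = 0.5 * (u_left + u_right)
    half_jump = 0.5 * (u_right - u_left)
    return mid + half_jump * math.tanh(x / delta)

def tanh_l1_error(u_left, u_right, delta, n=40000, cutoff=40.0):
    """Midpoint-rule L1 distance to the sharp step on [-cutoff*delta, cutoff*delta].
    The tails beyond the cutoff are negligible because 1 - tanh decays
    exponentially."""
    half_width = cutoff * delta
    dx = 2.0 * half_width / n
    total = 0.0
    for i in range(n):
        x = -half_width + (i + 0.5) * dx
        step = u_left if x < 0.0 else u_right
        total += abs(tanh_layer(x, u_left, u_right, delta) - step) * dx
    return total

u_left, u_right = 1.0, 0.0
results = []
for delta in (0.1, 0.05, 0.025):
    numeric = tanh_l1_error(u_left, u_right, delta)
    closed_form = abs(u_right - u_left) * delta * math.log(2.0)
    results.append((delta, numeric, closed_form))
    print(delta, numeric, closed_form)
```

The L¹ distance scales linearly with the layer width, so a width proportional to h gives an O(h) contribution per shock, in line with the scaling discussed in the rebuttal.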
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/ArithmeticFromLogic.lean, reality_from_one_distinction (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: "We introduce an entropy-compatible neural network method ... based on a computable surrogate of the Kružkov entropy residual ... construct explicit neural networks with provably small loss by combining shock-adapted continuous piecewise linear functions with known approximation properties of tanh neural networks."
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
- [1] Harten, A., Engquist, B., Osher, S., Chakravarthy, S.R.: Uniformly high order accurate essentially non-oscillatory schemes. III. Journal of Computational Physics 71(2), 231–303 (1987)
- [2] Liu, X.-D., Osher, S., Chan, T.: Weighted essentially non-oscillatory schemes. Journal of Computational Physics 115(1), 200–212 (1994)
- [3] Jiang, G.-S., Shu, C.-W.: Efficient implementation of weighted ENO schemes. Journal of Computational Physics 126, 202–228 (1996)
- [4] Cockburn, B., Shu, C.-W.: Runge–Kutta discontinuous Galerkin methods for convection-dominated problems. Journal of Scientific Computing 16(3), 173–261 (2001)
- [5] Tadmor, E.: Entropy stable schemes. In: Handbook of Numerical Analysis, vol. 17, pp. 467–493. Elsevier, Amsterdam (2016)
- [6] Kružkov, S.N.: First order quasilinear equations in several independent variables. Mathematics of the USSR-Sbornik 10(2), 217 (1970)
- [7] Kuznetsov, N.: Accuracy of some approximate methods for computing the weak solutions of a first-order quasi-linear equation. USSR Computational Mathematics and Mathematical Physics 16(6), 105–119 (1976)
- [8] Sanders, R.: On convergence of monotone finite difference schemes with variable spatial differencing. Mathematics of Computation 40(161), 91–106 (1983)
- [9] Cockburn, B., Gremaud, P.-A.: A priori error estimates for numerical methods for scalar conservation laws. Part I: The general approach. Mathematics of Computation 65(214), 533–573 (1996)
- [10] Makridakis, C., Perthame, B.: Optimal rate of convergence for anisotropic vanishing viscosity limit of a scalar balance law. SIAM Journal on Mathematical Analysis 34(6), 1300–1307 (2003)
- [11] Nessyahu, H., Tadmor, E.: The convergence rate of approximate solutions for nonlinear scalar conservation laws. SIAM Journal on Numerical Analysis 29(6), 1505–1519 (1992)
- [12] DiPerna, R.J.: Measure-valued solutions to conservation laws. Archive for Rational Mechanics and Analysis 88(3), 223–270 (1985)
- [13] Szepessy, A.: Convergence of a shock-capturing streamline diffusion finite element method for a scalar conservation law in two space dimensions. Mathematics of Computation 53(188), 527–545 (1989)
- [14] Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks (PINNs): A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational Physics 378, 686–707 (2019)
- [15] Akrivis, G., Makridakis, C.G., Smaragdakis, C.: Runge–Kutta physics informed neural networks: formulation and analysis. Numerische Mathematik 157(6), 1975–2016 (2025)
- [16] E, W., Yu, B.: The Deep Ritz method: a deep learning-based numerical algorithm for solving variational problems. Communications in Mathematics and Statistics 6(1), 1–12 (2018)
- [17] Cai, Z., Doktorova, A., Falgout, R.D., Herrera, C.: Efficient Shallow Ritz Method for One-Dimensional Diffusion-Reaction Problems. SIAM Journal on Scientific Computing, 414–435 (2025)
- [18] Li, Z., Kovachki, N., Azizzadenesheli, K., Liu, B., Bhattacharya, K., Stuart, A.M., Anandkumar, A.: Fourier neural operator for parametric partial differential equations. In: Advances in Neural Information Processing Systems, vol. 33, pp. 6755–6766 (2020)
- [19] Lu, L., Jin, P., Pang, G., Zhang, Z., Karniadakis, G.E.: Learning nonlinear operators via DeepONet based on the universal approximation theorem of operators. Nature Machine Intelligence 3, 218–229 (2021)
- [20] Du, Y., Zaki, T.A.: Evolutional deep neural network. Physical Review E 104(4), 045303 (2021)
- [21] Feischl, M., Lasser, C., Lubich, C., Nick, J.: Regularized dynamical parametric approximation. arXiv preprint arXiv:2403.19234 (2024)
- [22] Su, H., Zhang, L., Zhao, J.: SPIKE: Stable Physics-Informed Kernel Evolution Method for Solving Hyperbolic Conservation Laws. arXiv preprint arXiv:2510.18266 (2025)
- [23] De Ryck, T., Mishra, S.: Numerical analysis of physics-informed neural networks and related models in physics-informed machine learning. Acta Numerica 33, 633–713 (2024)
- [24] Gazoulis, D., Gkanis, I., Makridakis, C.G.: On the stability and convergence of physics informed neural networks. IMA Journal of Numerical Analysis, 090 (2025)
- [25] Chaumet, A., Giesselmann, J.: Improving weak PINNs for hyperbolic conservation laws: Dual norm computation, boundary conditions and systems. The SMAI Journal of Computational Mathematics 10, 373–401 (2024)
- [26] Cai, Z., Choi, J., Liu, M.: Least-squares neural network (LSNN) method for linear advection-reaction equation: Discontinuity interface. SIAM Journal on Scientific Computing 46(4), 448–478 (2024)
- [27] Liu, M., Cai, Z.: Least-Squares Neural Network (LSNN) Method for Scalar Hyperbolic Partial Differential Equations. arXiv preprint arXiv:2601.20013 (2026)
- [28] Patel, R.G., Manickam, I., Trask, N.A., Wood, M.A., Lee, M., Tomas, I., Cyr, E.C.: Thermodynamically consistent physics-informed neural networks for hyperbolic systems. Journal of Computational Physics 449, 110754 (2022)
- [29] De Ryck, T., Mishra, S.: Error analysis for deep neural network approximations of parametric hyperbolic conservation laws. Mathematics of Computation 93(350), 2643–2677 (2024)
- [30] Oubarka, I., Kissami, I., Boubekeur, M., Benkhaldoun, F., Madrane, A., Saadi, Z.: Weak and entropy physics-informed neural networks for conservation laws. arXiv preprint arXiv:2603.24819 (2026)
- [31] Bouchut, F., Perthame, B.: Kruzkov’s estimates for scalar conservation laws revisited. Transactions of the American Mathematical Society 350(7), 2847–2870 (1998)
- [32] De Ryck, T., Mishra, S., Molinaro, R.: wPINNs: Weak physics informed neural networks for approximating entropy solutions of hyperbolic conservation laws. SIAM Journal on Numerical Analysis 62(2), 811–841 (2024)
- [33] Puppo, G.: Numerical entropy production for central schemes. SIAM Journal on Scientific Computing 25(4), 1382–1415 (2004)
- [34] Puppo, G., Semplice, M.: Numerical entropy and adaptivity for finite volume schemes. Communications in Computational Physics 10(5), 1132–1160 (2011)
- [35] Mishra, S., Molinaro, R.: Estimates on the generalization error of physics-informed neural networks for approximating PDEs. IMA Journal of Numerical Analysis 43(1), 1–43 (2023)
- [36] Lebaud, M.-P.: Description de la formation d’un choc dans le p-système. Journal de mathématiques pures et appliquées 73(6), 523–566 (1994)
- [37] Yin, H., Zhu, L.: The shock formation and optimal regularities of the resulting shock curves for 1d scalar conservation laws. Nonlinearity 35(2), 954–997 (2022)
- [38] Longo, M., Opschoor, J.A., Disch, N., Schwab, C., Zech, J.: De Rham compatible deep neural network FEM. Neural Networks 165, 721–739 (2023)
- [39] He, J., Li, L., Xu, J., Zheng, C.: ReLU deep neural networks and linear finite elements. Journal of Computational Mathematics 38(3), 502–527 (2020)
Excerpt from the paper (Appendix A.1, Neural network approximation): In this appendix, we adapt the min–max construction underlying the ReLU representability of CPwL finite-element functions on simplicial meshes; see, e.g., [38, 39]....