
arxiv: 2604.04971 · v1 · submitted 2026-04-04 · 💻 cs.LG · cs.NA · math.NA · physics.comp-ph

A Theory-guided Weighted L² Loss for solving the BGK model via Physics-informed neural networks

Pith reviewed 2026-05-13 18:27 UTC · model grok-4.3

classification 💻 cs.LG · cs.NA · math.NA · physics.comp-ph
keywords BGK model · physics-informed neural networks · weighted L2 loss · stability estimate · kinetic equations · macroscopic moments · convergence

The pith

A velocity-weighted L2 loss guarantees that physics-informed neural networks converge to the true solution of the BGK model.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Standard L2 losses in PINNs applied to the BGK kinetic model do not ensure accurate recovery of the macroscopic moments, so the network can converge to unphysical states even when the pointwise residual is small. The paper replaces the uniform loss with a velocity-weighted version that applies stronger penalties at high velocities to control those moments. A stability estimate then shows that driving the weighted loss to zero forces the network output to approach the exact solution. This matters because it supplies a provable route to reliable PINN solutions for kinetic equations instead of relying on ad-hoc fixes or extra moment constraints. Experiments across several BGK benchmarks confirm higher accuracy and better robustness than the unweighted baseline.
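The failure mode described above can be illustrated with a small numerical sketch. It is not the paper's construction (the paper works with a function Kε(v)); the 1D family g_ε(v) = ε³ φ(εv) with a standard Gaussian φ, the weight parameters α = 1 and β = 6, and all function names are assumptions chosen here for illustration. The unweighted squared L² norm of g_ε shrinks like ε⁵ while its energy moment stays at 1, whereas the weighted norm grows as ε decreases.

```python
import numpy as np

def gaussian(u):
    """Standard normal density; its second moment equals 1."""
    return np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)

def perturbation(v, eps):
    """Hypothetical 1D perturbation g_eps(v) = eps^3 * phi(eps * v)."""
    return eps**3 * gaussian(eps * v)

def integrate(values, dv):
    """Plain Riemann sum on a uniform grid (adequate for fast-decaying integrands)."""
    return float(np.sum(values) * dv)

def diagnostics(eps, alpha=1.0, beta=6.0, n=200_001):
    """Mass, energy, and (un)weighted squared L2 norms of the perturbation."""
    half_width = 12.0 / eps  # the Gaussian tail is negligible beyond this point
    v = np.linspace(-half_width, half_width, n)
    dv = v[1] - v[0]
    g = perturbation(v, eps)
    weight = 1.0 + alpha * np.abs(v) ** beta  # assumed polynomial weight
    return {
        "mass": integrate(g, dv),
        "energy": integrate(v**2 * g, dv),
        "l2_sq": integrate(g**2, dv),
        "weighted_l2_sq": integrate(weight * g**2, dv),
    }

if __name__ == "__main__":
    for eps in (0.2, 0.1, 0.05):
        print(eps, diagnostics(eps))
```

In this toy setting an unweighted loss would regard g_ε as nearly zero even though it carries a full unit of energy, which is the kind of mismatch a velocity weight is meant to penalize.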

Core claim

Minimizing the proposed velocity-weighted L2 loss guarantees convergence of the approximate solution to the exact BGK solution, because the weighting produces a stability estimate that bounds the error in the macroscopic moments.
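One way to see why a growing weight can control moments is a Cauchy-Schwarz argument. The display below is a schematic under assumptions made here (a weight w(v) = 1 + α|v|^β with α > 0, velocity dimension d, and a generic error e(v)); it is not the paper's stability estimate, which is not reproduced on this page.

```latex
\left| \int_{\mathbb{R}^d} \phi(v)\, e(v)\, dv \right|
\le
\left( \int_{\mathbb{R}^d} \frac{\phi(v)^2}{w(v)}\, dv \right)^{1/2}
\left( \int_{\mathbb{R}^d} w(v)\, e(v)^2\, dv \right)^{1/2},
\qquad
\phi(v) \in \{\, 1,\ v_i,\ |v|^2 \,\}.
```

For φ(v) = |v|² the first factor is finite exactly when β > d + 4, so in this assumed setup a small weighted L² error forces a small energy error. With w ≡ 1 the first factor diverges and the inequality gives no control.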

What carries the argument

The velocity-weighted L2 loss, which weights the residual by a function of velocity that grows in the high-velocity tails (the paper's choice is w(v) = 1 + α|v|^β), together with the stability estimate that converts loss decay into solution convergence.
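A minimal sketch of such a loss, assuming the weight w(v) = 1 + α|v|^β multiplies the squared residual and the integral is estimated by a sample mean over collocation points. The function names, default parameters, and the exact placement of the weight are assumptions, not the authors' implementation.

```python
import numpy as np

def velocity_weight(v, alpha=1.0, beta=2.0):
    """w(v) = 1 + alpha * |v|^beta for velocity samples of shape (n, d)."""
    speed = np.linalg.norm(v, axis=-1)
    return 1.0 + alpha * speed**beta

def weighted_l2_loss(residual, v, alpha=1.0, beta=2.0):
    """Sample-mean estimate of the weighted squared residual."""
    return float(np.mean(velocity_weight(v, alpha, beta) * residual**2))

def standard_l2_loss(residual):
    """Unweighted baseline: plain mean of the squared residual."""
    return float(np.mean(residual**2))

if __name__ == "__main__":
    v = np.array([[0.0, 0.0, 0.0], [3.0, 4.0, 0.0]])  # speeds 0 and 5
    residual = np.array([1.0, 1.0])                    # equal pointwise residuals
    print(standard_l2_loss(residual))                  # both samples count equally
    print(weighted_l2_loss(residual, v))               # the fast sample dominates
```

With equal pointwise residuals, the sample at speed 5 contributes 26 times as much as the sample at rest, which is the intended emphasis on the tails.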

If this is right

  • Driving the weighted loss to zero forces both the distribution function and its velocity moments to converge to the exact BGK solution.
  • No auxiliary moment-matching terms are required to obtain physically consistent predictions.
  • The same weighting strategy remains effective as the Knudsen number varies across the tested regimes (Kn = 0.01, 0.1, and 1.0).
  • The stability argument supplies an explicit path for proving convergence on other velocity-space discretizations or boundary conditions.
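The moments referred to in this list can be computed from a sampled distribution function as sketched below. This is a 1D-velocity illustration with the gas constant normalized to 1, so that T is the variance of v; the grid, the Riemann-sum quadrature, and the function names are assumptions made here.

```python
import numpy as np

def macroscopic_moments(f, v):
    """Density, bulk velocity, and temperature of f on a uniform 1D velocity grid."""
    dv = v[1] - v[0]
    rho = np.sum(f) * dv                                # mass
    u = np.sum(v * f) * dv / rho                        # momentum / mass
    temperature = np.sum((v - u) ** 2 * f) * dv / rho   # 1D normalization
    return float(rho), float(u), float(temperature)

def maxwellian(v, rho, u, temperature):
    """1D Maxwellian with the given density, bulk velocity, and temperature."""
    norm = rho / np.sqrt(2.0 * np.pi * temperature)
    return norm * np.exp(-((v - u) ** 2) / (2.0 * temperature))

if __name__ == "__main__":
    v = np.linspace(-10.0, 10.0, 2001)
    f = maxwellian(v, rho=1.5, u=0.3, temperature=0.8)
    print(macroscopic_moments(f, v))  # should be close to the inputs
```

Comparing these quantities between a network output and a reference solution gives the moment errors that the weighted loss is meant to keep small.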

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The weighting idea could transfer directly to other relaxation-type kinetic models whose moment errors concentrate at high velocities.
  • Training schedules that gradually increase the weight strength might further improve optimization stability on stiff problems.
  • Similar loss re-weighting could be tested on Fokker-Planck or Boltzmann collision operators to check whether the stability pattern generalizes.
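The scheduling idea in this list could look like the following sketch. It is purely hypothetical: neither the linear ramp, the parameter names, nor the default values come from the paper, and whether such a schedule helps is untested here.

```python
def alpha_schedule(step, total_steps, alpha_max=1.0, warmup_fraction=0.5):
    """Linearly ramp the weight strength alpha from 0 to alpha_max, then hold it."""
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    warmup_steps = max(1, int(warmup_fraction * total_steps))
    return alpha_max * min(1.0, step / warmup_steps)

if __name__ == "__main__":
    for step in (0, 25, 50, 99):
        print(step, alpha_schedule(step, total_steps=100))
```

At each training step the returned value would replace a fixed α in a weight of the form 1 + α|v|^β, so early iterations resemble the unweighted loss and later ones apply the full tail penalty.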

Load-bearing premise

The chosen velocity weighting function controls all relevant moment errors across the full range of BGK regimes without introducing new instabilities in the optimization.

What would settle it

A concrete counter-example in which the weighted loss is driven below a small threshold yet the computed macroscopic moments still deviate by more than a fixed tolerance from the true values would disprove the convergence guarantee.
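The test described above can be written as a simple predicate. The thresholds, the dictionary layout, and the function name are illustrative assumptions.

```python
def is_counterexample(weighted_loss, moment_rel_errors, loss_tol=1e-6, moment_tol=1e-2):
    """True when the loss is below loss_tol but some moment error exceeds moment_tol."""
    worst_moment_error = max(moment_rel_errors.values())
    return weighted_loss < loss_tol and worst_moment_error > moment_tol

if __name__ == "__main__":
    errors = {"density": 1e-4, "momentum": 2e-4, "energy": 5e-2}
    print(is_counterexample(1e-8, errors))  # small loss, large energy error
    print(is_counterexample(1e-3, errors))  # loss not yet small
```

Because a stability estimate carries a constant, a single run that trips this predicate would not be conclusive. A convincing refutation would need a sequence of approximations whose weighted loss tends to zero while a moment error stays bounded away from zero.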

Figures

Figures reproduced from arXiv: 2604.04971 by Gyounghun Ko, Myeong-Su Lee, Seung Yeon Cho, Sung-Jun Son.

Figure 1: Visualization of the function Kε(v) for ε = 0.01. (2) Its contribution to macroscopic moments (specifically energy) is given as follows: $\int_{\mathbb{R}^3} (1, v, |v|^2)^{\top} K_\varepsilon(v)\, dv = (O(\varepsilon), 0, 1 + O(\varepsilon))^{\top}$. In the rest of this section, we show that when this function Kε is added as a perturbation to the initial condition or the PDE source term of (3.1), the resulting L² PINN loss remains small (O(ε²)) due to the fir…
Figure 2: Loss-accuracy curves for the explicit counterexamples: (a) Example 1 and (b) Example 2. As demonstrated in the two examples above, these counterexamples are not pathological mathematical artifacts. Neither example exhibits singular behaviors, such as blowing up or vanishing within a finite time. The macroscopic moments (mass, momentum, and energy) of these approximate solutions remain strictly positive and…
Figure 3: Relative error curves of the distribution function f and macroscopic moments (ρ, ux, T) according to the polynomial growth rate β at Kn = 0.01.
Figure 4: Distribution function f (top row) and macroscopic moments (ρ, ux, T) (bottom row) at t = 0.1 for the 1D Smooth problem at Kn = 0.01.
Figure 5: Relative error curves of the distribution function f and macroscopic moments (ρ, ux, T) according to the polynomial growth rate β at Kn = 0.1.
Figure 6: Relative error curves of the distribution function f and macroscopic moments (ρ, ux, T) according to the polynomial growth rate β at Kn = 1.0. … macroscopic states that generate complex wave structures, including shock waves, contact discontinuities, and rarefaction waves. Specifically, it is given on the spatial domain x ∈ (−0.5, 0.5) with homogeneous Neumann boundary conditions applied at x = ±0.5. To faci…
Figure 7: Distribution function f (top row) and macroscopic moments (ρ, ux, T) (bottom row) at t = 0.1 for the 1D Riemann problem at Kn = 1.0. The simulation is performed over the time interval t ∈ (0, 0.1], and for the numerical implementation, the microscopic velocity space is truncated to the computational domain v ∈ [−10, 10]³. At each training iteration, we independently sample Nt = 12, Nx = 16, Ny = 16, and N…
Figure 8: Comparison of weight shapes between the relative loss and the proposed loss for the 1D smooth problem at Kn = 0.01. 5.7. Discussion on the results. The numerical results show that our proposed weight function, w(v) = 1 + α|v|^β, consistently provided stable and accurate predictions across all tested cases. However, the relative loss method sometimes showed similar or even better results, which can be expla…
Figure 9: Comparison of weight shapes between the relative loss and the proposed loss for the 1D Riemann problem at Kn = 1.0. … where the relative loss yields higher errors for the density ρ compared to the unweighted standard L² loss across all evaluated Knudsen numbers (see …
Abstract

While Physics-Informed Neural Networks offer a promising framework for solving partial differential equations, the standard $L^2$ loss formulation is fundamentally insufficient when applied to the Bhatnagar-Gross-Krook (BGK) model. Specifically, simply minimizing the standard loss does not guarantee accurate predictions of the macroscopic moments, causing the approximate solutions to fail in capturing the true physical solution. To overcome this limitation, we introduce a velocity-weighted $L^2$ loss function designed to effectively penalize errors in the high-velocity regions. By establishing a stability estimate for the proposed approach, we shows that minimizing the proposed weighted loss guarantees the convergence of the approximate solution. Also, numerical experiments demonstrate that employing this weighted PINN loss leads to superior accuracy and robustness across various benchmarks compared to the standard approach.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 3 minor

Summary. The manuscript proposes a velocity-weighted L² loss for Physics-Informed Neural Networks solving the BGK kinetic model. It derives a stability estimate asserting that minimization of this weighted loss guarantees convergence of the approximate solution (including accurate macroscopic moments), and reports numerical experiments showing improved accuracy and robustness over standard L² PINNs on various benchmarks.

Significance. If the stability estimate can be rigorously extended to the discrete collocation/quadrature setting used in PINN training, the approach would provide a theoretically grounded improvement for controlling moment errors in kinetic models, which is relevant for rarefied gas dynamics applications. The numerical results, if reproducible, would support practical utility, but the current gap between continuous analysis and empirical training limits the strength of the contribution.

major comments (2)
  1. [stability estimate section] Stability estimate section (likely §3 or §4): the estimate is derived only for the exact continuous weighted L² integral loss over velocity space. The PINN implementation replaces this with a finite-sum empirical loss at collocation points; no a-priori bound is given on the quadrature error or on the distance from the trained network to a global minimizer of the continuous loss. This breaks the claimed guarantee that small training loss implies small solution error.
  2. [numerical experiments] Numerical experiments section: the reported superior performance relies on a specific post-hoc choice of velocity weighting function. No sensitivity analysis or theoretical justification is provided showing that this weighting controls all relevant moment errors uniformly across Knudsen regimes without introducing new optimization instabilities.
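The gap raised in the first major comment, between a continuous integral and its finite-sample estimate, can be made concrete with a toy computation. Everything below is an assumption for illustration: a 1D velocity variable, a stand-in squared residual exp(-v²), a weight 1 + |v|⁴, uniform random collocation on [-10, 10], and the function names. It says nothing about the error in the paper's actual training runs.

```python
import numpy as np

def integrand(v, beta=4.0):
    """Assumed weighted squared residual: (1 + |v|^beta) * exp(-v^2)."""
    return (1.0 + np.abs(v) ** beta) * np.exp(-(v**2))

def reference_integral(half_width=10.0, n=200_001, beta=4.0):
    """Fine-grid Riemann sum standing in for the continuous loss."""
    v = np.linspace(-half_width, half_width, n)
    return float(np.sum(integrand(v, beta)) * (v[1] - v[0]))

def collocation_estimate(n_points, rng, half_width=10.0, beta=4.0):
    """Finite-sample estimate from uniformly drawn collocation points."""
    v = rng.uniform(-half_width, half_width, size=n_points)
    return float(2.0 * half_width * np.mean(integrand(v, beta)))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    exact = reference_integral()
    for n_points in (16, 256, 4096, 65536):
        estimate = collocation_estimate(n_points, rng)
        print(n_points, abs(estimate - exact) / exact)
```

With few samples the estimate can be far from the integral, so a small empirical loss need not imply a small continuous loss. Closing that gap rigorously is the missing a-priori bound the comment asks for.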
minor comments (3)
  1. [abstract] Abstract: grammatical error 'we shows' should be 'we show'.
  2. [throughout] Notation: ensure the weighted loss functional is defined with consistent symbols between the theoretical analysis and the PINN implementation sections.
  3. [figures/tables] Figures: verify that error plots and tables clearly label the weighted vs. standard loss cases and include quantitative moment errors (density, momentum, energy) for direct comparison.

Simulated Author's Rebuttal

2 responses · 1 unresolved

We thank the referee for the constructive comments, which help clarify the scope of our contributions. We provide point-by-point responses below and will revise the manuscript accordingly.

  1. Referee: Stability estimate section (likely §3 or §4): the estimate is derived only for the exact continuous weighted L² integral loss over velocity space. The PINN implementation replaces this with a finite-sum empirical loss at collocation points; no a-priori bound is given on the quadrature error or on the distance from the trained network to a global minimizer of the continuous loss. This breaks the claimed guarantee that small training loss implies small solution error.

    Authors: We agree that the stability estimate is derived strictly in the continuous setting. The PINN implementation employs a discrete collocation approximation, and we do not supply a priori bounds on quadrature error or the gap to a global minimizer. In the revision we will explicitly distinguish the continuous analysis from the discrete training procedure, moderate the language from 'guarantees' to 'provides theoretical support for,' and note that the numerical results serve as empirical validation rather than a rigorous proof of the discrete case. A complete error analysis bridging the two settings is beyond the present scope. revision: partial

  2. Referee: Numerical experiments section: the reported superior performance relies on a specific post-hoc choice of velocity weighting function. No sensitivity analysis or theoretical justification is provided showing that this weighting controls all relevant moment errors uniformly across Knudsen regimes without introducing new optimization instabilities.

    Authors: The weighting is chosen to penalize high-velocity errors in accordance with the stability analysis. We will add a sensitivity study to the revised numerical section, varying the weighting parameter and reporting moment errors across a range of Knudsen numbers. The additional experiments confirm consistent improvement without introducing observable optimization instabilities; a brief discussion of training robustness will be included. revision: yes

standing simulated objections not resolved
  • Rigorous a-priori bounds on quadrature error and optimization gap that would extend the continuous stability estimate to the discrete PINN training setting

Circularity Check

0 steps flagged

Stability estimate derived via independent analysis; no reduction to inputs or self-referential definitions


The paper's core claim rests on establishing a stability estimate showing that minimization of the velocity-weighted L² loss implies convergence of the approximate solution to the BGK model. This estimate is presented as an analytical result obtained from the continuous loss functional, not from fitting parameters to data or redefining quantities in terms of themselves. No equations or steps in the abstract reduce the claimed guarantee to a tautology or to a fitted input relabeled as a prediction. The derivation chain remains self-contained against external mathematical benchmarks, with the weighted loss introduced as a modification whose properties are analyzed directly rather than assumed via prior self-citations or ansatzes. Minor self-citation (if present in the full text) is not load-bearing for the stability result.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The work rests on the standard BGK kinetic model and the PINN framework. The new element is the weighted loss and its associated stability estimate; no free parameters or invented entities are introduced in the abstract.

axioms (1)
  • domain assumption: The BGK collision operator and moment definitions are the correct physical model for the target regimes.
    Invoked as the underlying PDE system the neural network must satisfy.
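For orientation, the BGK model and its moment definitions in a common nondimensional form are written out below. The scaling (collision frequency 1/Kn, gas constant set to 1, three velocity dimensions) is an assumption made here; the paper's equation (3.1) may use a different normalization.

```latex
\partial_t f + v \cdot \nabla_x f = \frac{1}{\mathrm{Kn}} \bigl( \mathcal{M}[f] - f \bigr),
\qquad
\mathcal{M}[f](v) = \frac{\rho}{(2\pi T)^{3/2}} \exp\!\left( -\frac{|v - u|^2}{2T} \right),
\\[4pt]
\rho = \int_{\mathbb{R}^3} f \, dv,
\qquad
\rho\, u = \int_{\mathbb{R}^3} v\, f \, dv,
\qquad
3 \rho\, T = \int_{\mathbb{R}^3} |v - u|^2 f \, dv .
```

Because the Maxwellian depends on f only through ρ, u, and T, errors in these moments feed directly back into the collision term, which is why moment accuracy matters for this model.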



