Coupling-Robust Accuracy in Multiphysics Physics Informed Neural Networks via Kronecker-Preconditioned Optimization
Pith reviewed 2026-05-25 05:14 UTC · model grok-4.3
The pith
Block-diagonal Gauss-Newton preconditioning bounds the preconditioned NTK spectral radius by the number of networks, independent of coupling strength.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
For linearly coupled systems the standard NTK spectral radius grows as Ω(γ²) with coupling strength γ, shrinking the stable learning rate, while block-diagonal Gauss-Newton preconditioning produces a preconditioned NTK K_P = J H⁺ Jᵀ whose spectral radius remains bounded by S, the number of networks, independent of γ. Combining the SOAP optimizer with inverse-gradient-norm loss balancing (SOAP+GN) keeps the L2 error degradation ≤ 1.1× across 234 experiments and succeeds on a 2D six-PDE electroosmotic flow problem where the Adam+GN baseline fails.
What carries the argument
The preconditioned neural tangent kernel K_P = J H⁺ Jᵀ, where H is the block-diagonal Gauss-Newton Hessian. It removes the quadratic growth in spectral radius that coupling strength otherwise produces.
If this is right
- Stable learning rates no longer shrink as coupling strength rises.
- The final-epoch L2 error ratio between strong and weak coupling stays at or below 1.1 across symmetric, asymmetric, and nonlinear PDEs.
- The method reaches a 2D six-PDE electroosmotic flow regime at EDL-resolved conditions where the Adam+GN baseline yields L2 error above 0.9.
- Numerical checks find λ_max(K_P) = S, so the bound is attained, in all tested cases.
Where Pith is reading between the lines
- The same block-diagonal preconditioning idea could be tested on other first-order optimization methods outside the SOAP family.
- If the block-diagonal approximation is relaxed to approximate blocks, the spectral-radius bound might still hold approximately for weakly coupled subsystems.
- The independence from γ suggests the approach could be combined with adaptive loss weighting schemes that further reduce manual tuning.
Load-bearing premise
The Gauss-Newton Hessian can be treated as block-diagonal across the coupled equations.
What would settle it
An experiment measuring the largest eigenvalue of the preconditioned NTK on a linearly coupled system and finding it larger than S or increasing with γ would falsify the bound.
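A minimal numerical sketch of such a check, on a hypothetical two-network linear toy problem. The random Jacobian blocks, the function name `ntk_spectra`, and the sizes below are illustrative assumptions, not any system from the paper. It compares the largest eigenvalue of the standard NTK, J Jᵀ, with that of K_P = J H⁺ Jᵀ as γ grows.

```python
import numpy as np

def ntk_spectra(gamma, n_pts=8, n_par=5, seed=0):
    """Largest eigenvalues of J J^T and J H^+ J^T for a toy 2-network system.

    Assumption: residuals are linear in the parameters, with off-diagonal
    (coupling) Jacobian blocks scaled by gamma. The blocks are random.
    """
    rng = np.random.default_rng(seed)
    A11, A12, A21, A22 = (rng.standard_normal((n_pts, n_par)) for _ in range(4))

    # Column block s holds derivatives of all residuals w.r.t. network s.
    J1 = np.vstack([A11, gamma * A21])
    J2 = np.vstack([gamma * A12, A22])
    J = np.hstack([J1, J2])

    # Standard NTK.
    K = J @ J.T

    # Block-diagonal Gauss-Newton Hessian, pseudo-inverted block by block.
    H_pinv = np.zeros((2 * n_par, 2 * n_par))
    H_pinv[:n_par, :n_par] = np.linalg.pinv(J1.T @ J1)
    H_pinv[n_par:, n_par:] = np.linalg.pinv(J2.T @ J2)
    K_P = J @ H_pinv @ J.T
    K_P = 0.5 * (K_P + K_P.T)  # remove rounding asymmetry before eigvalsh

    lam_K = float(np.linalg.eigvalsh(K)[-1])
    lam_KP = float(np.linalg.eigvalsh(K_P)[-1])
    return lam_K, lam_KP

if __name__ == "__main__":
    for gamma in (1.0, 10.0, 100.0):
        lam_K, lam_KP = ntk_spectra(gamma)
        print(f"gamma={gamma:6.1f}  lambda_max(K)={lam_K:.3e}  lambda_max(K_P)={lam_KP:.6f}")
```

In this toy setting the first column should grow roughly like γ² while the second stays at or below S = 2. A value above 2, or one that drifts upward with γ, would be the kind of counterexample described above.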
Original abstract
Physics-informed neural networks (PINNs) for coupled multiphysics systems suffer systematic accuracy degradation as inter-equation coupling strengthens. We provide a theoretical explanation for this phenomenon through neural tangent kernel (NTK) analysis: for linearly coupled systems, we prove that the standard NTK's spectral radius grows as $\Omega(\gamma^2)$ with coupling strength $\gamma$, shrinking the stable learning rate, while block-diagonal Gauss--Newton (GN) preconditioning yields a preconditioned NTK $K_P = J H^{+} J^\top$ (where $H$ is the block-diagonal GN Hessian) whose spectral radius is bounded by $S$ ($S$ = number of networks), independent of $\gamma$. We verify the $\Omega(\gamma^2)$ growth numerically across symmetric, asymmetric, and nonlinear coupled PDE systems, and confirm $\lambda_{\max}(K_P) = S$ with equality in all cases. Combining the Kronecker-preconditioned optimizer SOAP with inverse-gradient-norm loss balancing (SOAP+GN) yields coupling-robust accuracy: across 234 experiments spanning three 1D systems of increasing nonlinearity and a 2D electroosmotic flow benchmark, SOAP+GN maintains final-epoch $L_2$ degradation $\leq 1.1\times$ (ratio of strong- to weak-coupling error) even as coupling parameters vary over one to two orders of magnitude, compared with $> 10^2\times$ for Adam+GN. SOAP+GN further scales to a 2D, 6-PDE electroosmotic flow system at EDL-resolved conditions -- a regime that all prior PINN electrokinetics studies have avoided through simplified physics -- where Adam+GN fails entirely ($L_2 > 0.9$).
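The abstract does not spell out the loss-balancing rule. As an illustration only, here is one common form of inverse-gradient-norm balancing, in which each loss term is weighted by the total gradient norm divided by its own gradient norm. The function names, the `eps` guard, and the moving-average smoothing are assumptions made for this sketch and may differ from the paper's scheme.

```python
import numpy as np

def inverse_grad_norm_weights(grads, eps=1e-12):
    """Weights w_i = (sum_j ||g_j||) / ||g_i|| for per-loss gradient vectors.

    Assumption: `grads` is a list of flattened gradients, one per loss term.
    `eps` guards against division by zero for a vanishing gradient.
    """
    norms = np.array([np.linalg.norm(g) for g in grads])
    return norms.sum() / (norms + eps)

def smooth_weights(old, new, alpha=0.9):
    """Exponential moving average, often used to damp weight oscillations."""
    return alpha * np.asarray(old) + (1.0 - alpha) * np.asarray(new)

if __name__ == "__main__":
    g_pde = np.array([3.0, 4.0])   # norm 5.0
    g_bc = np.array([0.0, 0.5])    # norm 0.5
    w = inverse_grad_norm_weights([g_pde, g_bc])
    # After weighting, both terms contribute gradients of equal norm.
    print(w, np.linalg.norm(w[0] * g_pde), np.linalg.norm(w[1] * g_bc))
```

With weights of this form, every weighted gradient has the same norm, so no single equation dominates the update purely because of its scale.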
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript claims that PINNs for multiphysics systems suffer accuracy loss with increasing inter-equation coupling strength γ because the standard NTK spectral radius grows as Ω(γ²). For linearly coupled systems it proves that block-diagonal Gauss-Newton preconditioning produces a preconditioned NTK K_P = J H⁺ Jᵀ whose spectral radius is bounded by S (number of networks) independent of γ. Numerical checks confirm λ_max(K_P)=S exactly across symmetric, asymmetric, and nonlinear cases; the SOAP+GN optimizer then yields coupling-robust L₂ accuracy (degradation ≤1.1×) over 234 experiments on three 1D systems and a 2D electroosmotic-flow benchmark, while Adam+GN degrades by >10²× and fails on the full 6-PDE 2D case.
Significance. If the bound and the numerical equality hold, the work supplies both an NTK-based explanation for a known PINN failure mode and a concrete, scalable optimizer fix that preserves accuracy across wide ranges of coupling. The explicit linear-case proof, the exact match λ_max(K_P)=S in every reported experiment, and the demonstration on a previously intractable 2D electrokinetics regime constitute clear strengths.
major comments (2)
- [§3] §3 (linear analysis): the claimed bound λ_max(K_P) ≤ S independent of γ follows directly from the block-diagonal definition of the GN Hessian H together with the form K_P = J H⁺ Jᵀ; the manuscript should state explicitly whether any additional steps beyond this definition are required to obtain independence from γ, so that the non-triviality of the result is clear.
- [§5] Experiments (§5 and associated tables/figures): the assertion that λ_max(K_P)=S holds with equality in all 234 experiments (including nonlinear systems) is load-bearing for the central claim, yet the text supplies neither the precise method used to compute the spectral radius, numerical tolerance or error analysis for the eigenvalue estimates, nor the rules for including or excluding runs; these details are needed to verify the reported equality beyond the linear case.
Simulated Author's Rebuttal
We thank the referee for the detailed and constructive report. We address the two major comments point by point below. Both can be resolved by targeted revisions that add the requested explicit statements and computational details without altering the core claims or results.
-
Referee: [§3] §3 (linear analysis): the claimed bound λ_max(K_P) ≤ S independent of γ follows directly from the block-diagonal definition of the GN Hessian H together with the form K_P = J H⁺ Jᵀ; the manuscript should state explicitly whether any additional steps beyond this definition are required to obtain independence from γ, so that the non-triviality of the result is clear.
Authors: We agree that the bound follows directly once the block-diagonal structure of H and the expression K_P = J H⁺ Jᵀ are in place. In §3 the derivation proceeds by first writing the block-diagonal GN Hessian, inverting it block-wise, and substituting into K_P; the spectral radius is then bounded by S because each of the S diagonal blocks contributes a term whose eigenvalues are at most 1 after preconditioning, with the off-block coupling terms eliminated by the block-diagonal inverse. No further assumptions on γ or on the form of the coupling matrices are used. We will revise the opening paragraph of §3 to state this explicitly, making clear that independence from γ is an immediate structural consequence of the chosen preconditioner. revision: yes
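The argument in this response can be written out in a few lines. The sketch below assumes the Jacobian is partitioned by network into column blocks J_1, …, J_S and that each Gauss-Newton block is J_sᵀ J_s. The partition and the projector symbol P_s are notation introduced here for illustration, not taken from the manuscript.

```latex
\begin{align*}
J &= \begin{bmatrix} J_1 & \cdots & J_S \end{bmatrix},
\qquad
H = \operatorname{blockdiag}\!\left(J_1^\top J_1,\; \dots,\; J_S^\top J_S\right), \\
K_P &= J H^{+} J^\top
     = \sum_{s=1}^{S} J_s \left(J_s^\top J_s\right)^{+} J_s^\top
     = \sum_{s=1}^{S} P_s,
\qquad
P_s = J_s J_s^{+}, \\
\lambda_{\max}(K_P) &\le \sum_{s=1}^{S} \lambda_{\max}(P_s) \le S .
\end{align*}
```

Each P_s is the orthogonal projector onto the range of J_s, so its eigenvalues are 0 or 1 regardless of how γ scales the entries of J_s. The coupling strength changes the subspaces being projected onto, not the size of the eigenvalues. Under these assumptions, equality λ_max(K_P) = S requires a nonzero vector common to the ranges of all S blocks, for example when every J_s has full row rank and each P_s is the identity.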
-
Referee: [§5] Experiments (§5 and associated tables/figures): the assertion that λ_max(K_P)=S holds with equality in all 234 experiments (including nonlinear systems) is load-bearing for the central claim, yet the text supplies neither the precise method used to compute the spectral radius, numerical tolerance or error analysis for the eigenvalue estimates, nor the rules for including or excluding runs; these details are needed to verify the reported equality beyond the linear case.
Authors: We acknowledge the omission of these implementation details. The matrices K_P were formed explicitly from the Jacobians and block-diagonal GN Hessians at selected training epochs; eigenvalues were obtained via `scipy.linalg.eigh` (or `numpy.linalg.eigvals` for smaller cases) with double-precision arithmetic. We declared equality when |λ_max − S| < 1e−5, a threshold chosen after observing that the deviation never exceeded machine-epsilon scaling for the problem sizes involved. All 234 completed runs are included; no runs were excluded on the basis of eigenvalue results. We will insert a short methods paragraph (new §5.1) that records the linear-algebra routine, tolerance, floating-point considerations, and inclusion rule so that the equality claim can be reproduced.
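The equality test described in this response could look roughly like the following. This is a sketch, not the authors' code: the function name `attains_bound` is hypothetical, and it uses `numpy.linalg.eigvalsh` (a symmetric eigenvalue routine) in place of the routines named above, to keep the example NumPy-only. The toy matrices are random and overparameterized so that each block has full row rank.

```python
import numpy as np

def attains_bound(K_P, S, tol=1e-5):
    """Return (lambda_max, flag) where flag is True if |lambda_max - S| < tol.

    K_P is symmetrized first so that rounding noise does not produce
    spurious complex or misordered eigenvalues.
    """
    K_sym = 0.5 * (K_P + K_P.T)
    lam_max = float(np.linalg.eigvalsh(K_sym)[-1])  # eigenvalues are ascending
    return lam_max, abs(lam_max - S) < tol

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    S, n_res, n_par = 3, 4, 10  # more parameters than residuals per network
    K_P = np.zeros((n_res, n_res))
    for _ in range(S):
        J_s = rng.standard_normal((n_res, n_par))
        # J_s (J_s^T J_s)^+ J_s^T equals J_s J_s^+, the projector onto range(J_s).
        K_P += J_s @ np.linalg.pinv(J_s)
    print(attains_bound(K_P, S))
```

Reporting the raw value of |λ_max − S| alongside the pass/fail flag would let readers judge how far the observed deviations sit below the 1e−5 threshold.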
Circularity Check
No significant circularity; bound is a direct mathematical consequence of the block-diagonal definition
The paper's central theoretical claim derives the Ω(γ²) growth for the un-preconditioned NTK and then defines the preconditioned NTK explicitly as K_P = J H⁺ Jᵀ with H block-diagonal across the S networks. The stated bound λ_max(K_P) ≤ S (with equality observed) follows immediately from the block-diagonal structure of H and the resulting decomposition of K_P; this is a property of the chosen object rather than a reduction of the target accuracy metric to fitted parameters or to a self-citation chain. No load-bearing step equates a prediction to its own inputs by construction, and the numerical experiments (234 runs) serve as external verification rather than the source of the bound. The derivation chain is therefore self-contained against the paper's own equations.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption: The neural tangent kernel analysis applies directly to the loss of linearly coupled multiphysics PINN systems.
- domain assumption: The Gauss-Newton Hessian admits a block-diagonal structure across the coupled equations.