Unified generalization analysis for physics informed neural networks

Tomoharu Iwata; Yuka Hashimoto

arxiv: 2605.13260 · v1 · pith:32CVCHKBnew · submitted 2026-05-13 · 💻 cs.LG · math.AP · math.FA · stat.ML

Pith reviewed 2026-05-14 19:56 UTC · model grok-4.3

classification 💻 cs.LG · math.AP · math.FA · stat.ML

keywords generalization bounds · physics-informed neural networks · PINNs · VPINNs · Koopman analysis · Taylor expansion · differential operators

The pith

High-rank neural networks generalize well for PINNs and VPINNs even with differential operators, though nonlinearity enlarges the bounds exponentially.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper derives unified generalization bounds for neural networks that incorporate differentiation with respect to inputs, covering both Physics-Informed Neural Networks and their variational versions. It applies Taylor expansion to recast nonlinear differential operators as linear operators acting on a higher-dimensional feature space. This step enables Koopman operator techniques to analyze generalization without the stability or ellipticity assumptions required in prior work. The resulting bounds show that sufficiently high-rank networks can still generalize effectively, but the degree of nonlinearity in the operator causes an exponential widening of the bound. The results matter because they provide a concrete way to assess reliability of PINNs on scientific problems involving physical laws.

Core claim

We derive generalization bounds for neural networks that involve differentiation with respect to input variables, covering PINNs and VPINNs under a unified framework. We apply Taylor expansion to represent nonlinear differential operators as linear operators on a high-dimensional space, enabling the use of Koopman-based analysis and showing that high-rank networks can generalize well even in settings involving differential operators. We also show that the nonlinearity of the differential operator exponentially enlarges the bound, highlighting its significant impact on generalization.

What carries the argument

Taylor expansion that represents nonlinear differential operators as linear operators on an expanded high-dimensional space, enabling subsequent Koopman-based generalization analysis.
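
The lifting idea can be illustrated with a minimal Python sketch. This is not the paper's construction: it assumes a scalar pointwise nonlinearity (sin, chosen purely for illustration) and shows that its truncated Taylor series is a linear functional of the monomial features (1, u, ..., u^K). The helper names `lift` and `taylor_coeffs_sin` are hypothetical.

```python
import math
import numpy as np

def lift(u, order):
    """Map values u to monomial features (1, u, u^2, ..., u^order)."""
    u = np.asarray(u, dtype=float)
    return np.stack([u**k for k in range(order + 1)], axis=-1)

def taylor_coeffs_sin(order):
    """Taylor coefficients of sin around 0, truncated at the given order."""
    coeffs = np.zeros(order + 1)
    for k in range(1, order + 1, 2):  # only odd powers are nonzero
        coeffs[k] = (-1) ** ((k - 1) // 2) / math.factorial(k)
    return coeffs

# Illustrative assumption: values of u stay in the bounded range [-1, 1].
u = np.linspace(-1.0, 1.0, 201)
errors = {}
for order in (1, 3, 5, 7):
    features = lift(u, order)                      # shape (201, order + 1)
    approx = features @ taylor_coeffs_sin(order)   # linear in the lifted features
    errors[order] = float(np.max(np.abs(approx - np.sin(u))))
    print(order, features.shape[-1], errors[order])
```

In this scalar setting the nonlinear map u -> sin(u) becomes a fixed coefficient vector acting linearly on the lifted features, and the approximation error on the bounded range shrinks as the Taylor order (and with it the lifted dimension) grows.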

If this is right

High-rank networks can generalize well even in the presence of differential operators.
The nonlinearity of the differential operator causes an exponential enlargement of the generalization bound.
The same analysis framework applies uniformly to both PINNs and VPINNs without needing stability or linear-ellipticity assumptions.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Architectures could be selected by estimating the nonlinearity level of the target PDE in advance to keep the generalization bound manageable.
The same Taylor-plus-Koopman reduction might apply to other tasks that learn differential or integral operators.
Numerical checks on concrete nonlinear PDEs could confirm whether observed errors follow the predicted exponential scaling with nonlinearity.

Load-bearing premise

Nonlinear differential operators admit a sufficiently accurate linear representation on a high-dimensional space via Taylor expansion under suitable smoothness and boundedness conditions.

What would settle it

An experiment that measures the generalization gap of a PINN trained on a nonlinear PDE and checks whether the gap grows exponentially with increasing nonlinearity strength while holding network rank fixed.
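
One way to analyze such an experiment is to regress the logarithm of the measured gap on the nonlinearity strength: exponential growth appears as a straight line with positive slope. The sketch below assumes the gaps have already been measured and are strictly positive. The function name `fit_exponential_growth` is hypothetical, and the numbers are synthetic placeholders, not results from the paper.

```python
import numpy as np

def fit_exponential_growth(strengths, gaps):
    """Least-squares fit of log(gap) = intercept + rate * strength.

    Returns (rate, intercept, r_squared). Gaps must be strictly positive.
    """
    strengths = np.asarray(strengths, dtype=float)
    log_gaps = np.log(np.asarray(gaps, dtype=float))
    rate, intercept = np.polyfit(strengths, log_gaps, deg=1)
    predicted = intercept + rate * strengths
    ss_res = float(np.sum((log_gaps - predicted) ** 2))
    ss_tot = float(np.sum((log_gaps - log_gaps.mean()) ** 2))
    return float(rate), float(intercept), 1.0 - ss_res / ss_tot

# Synthetic placeholder values, NOT measurements from the paper.
strengths = np.array([0.0, 0.5, 1.0, 1.5, 2.0])
gaps = 0.01 * np.exp(1.5 * strengths)

rate, intercept, r2 = fit_exponential_growth(strengths, gaps)
print(rate, intercept, r2)
```

With real measurements, a clearly positive rate together with a good linear fit in log scale would be consistent with exponential scaling, while a poor fit would suggest a different dependence. Network rank should be held fixed across the runs being compared.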

Figures

Figures reproduced from arXiv: 2605.13260 by Tomoharu Iwata, Yuka Hashimoto.

**Figure 1.** (a) Test loss $L_{\mathrm{VPINN}}$ with and without the regularization based on Theorem 13 (average ± standard deviation of three independent runs). (b) Test loss $L_{\mathrm{PINN}}$ with and without the regularization based on Theorem 13 (average ± standard deviation of three independent runs). [PITH_FULL_IMAGE:figures/full_fig_p009_1.png]

**Figure 2.** Scatter plot of the test error versus $\tilde{A}_l / D_l^{1/2}$ (average of 3 independent runs). [PITH_FULL_IMAGE:figures/full_fig_p009_2.png]

Original abstract

Physics-Informed Neural Networks (PINNs) and their variational counterparts (VPINNs) are neural networks that incorporate physical laws, making them useful for scientific problems. Existing generalization analyses for PINNs and VPINNs remain limited, often requiring restrictive assumptions such as stability conditions or linear ellipticity. In this paper, we derive generalization bounds for neural networks that involve differentiation with respect to input variables, covering PINNs and VPINNs under a unified framework. We apply Taylor expansion to represent nonlinear differential operators as linear operators on a high-dimensional space, enabling the use of Koopman-based analysis and showing that high-rank networks can generalize well even in settings involving differential operators. We also show that the nonlinearity of the differential operator exponentially enlarges the bound, highlighting its significant impact on generalization.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper unifies generalization bounds for PINNs via Taylor lift to Koopman space but leaves the nonlinear remainder terms uncontrolled.

The main point is that this work derives generalization bounds for PINNs and VPINNs in one framework by using Taylor expansion to turn nonlinear differential operators into linear ones on a higher-dimensional space, then applying Koopman analysis. It concludes that high-rank networks can still generalize and that nonlinearity makes the bound grow exponentially larger.

This relaxes the stability and linear ellipticity assumptions that constrained earlier results, which is a clear step forward for the area. The unified treatment of standard and variational PINNs is practical, and the Koopman step gives a clean way to handle the differential parts of the loss. The dependence on network rank comes out in a usable form.

The soft spot sits in the Taylor step. For typical nonlinear terms such as convective products or cubic reactions, the remainder involves higher derivatives whose size is not bounded independently of the network parameters or the underlying PDE solution. Without uniform control on that remainder over the hypothesis class, the claimed bounds become conditional on extra regularity that is not stated up front. The exponential enlargement and the positive result for high-rank networks both rest on this point holding.

The rest of the derivation appears internally consistent with no circular definitions or free parameters. Citations track the relevant prior generalization work for PINNs.

This paper is aimed at researchers who already follow theoretical analysis of physics-informed networks. Someone looking for a Koopman-based route to bounds that cover input derivatives will get concrete value from the framework, even if they end up tightening the remainder argument themselves. It deserves peer review. The gap it targets is real and the approach is distinct enough that referees can usefully check the linearization details and suggest fixes.

Referee Report

2 major / 2 minor

Summary. The paper derives generalization bounds for PINNs and VPINNs under a unified framework by applying Taylor expansion to represent nonlinear differential operators as linear operators on a high-dimensional augmented space. This enables Koopman-based analysis, leading to the claims that high-rank networks can generalize well even with differential operators and that nonlinearity of the operator exponentially enlarges the generalization bound.

Significance. If the central derivation holds with controlled remainders, the work provides a novel unified theoretical lens on generalization for physics-informed networks, potentially explaining empirical success of high-rank architectures and quantifying nonlinearity's impact. This could guide architecture selection in scientific machine learning applications involving PDEs.

major comments (2)

[Section 3 (Taylor expansion and Koopman lifting)] The Taylor linearization step (used to lift nonlinear operators such as convective or reaction terms to linear operators on an augmented feature space) does not provide uniform control over the Lagrange or integral remainder term. For the subsequent Rademacher or covering-number bounds to hold with the claimed dependence on network rank, the remainder must be shown to be o(1) uniformly over the hypothesis class and independent of network parameters; without this, the exponential enlargement claim and the 'high-rank networks generalize well' conclusion become conditional on unstated extra assumptions on solution regularity and higher derivatives.
[Theorem 4.2 (or equivalent main generalization result)] The transition from the linearized operator to the final generalization bound (likely in the main theorem) appears to absorb the nonlinearity factor directly into an exponential term, but this step requires explicit verification that the remainder does not scale with network complexity or the differential operator's order; otherwise the bound may not be load-bearing for the central claims.
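
For orientation, the kind of uniform remainder control the major comments ask for can be written out in the simplest scalar case. This is a sketch under stated assumptions, not the paper's lemma: $F$ is a scalar nonlinearity of class $C^{K+1}$, $K$ is the Taylor order, $u_\theta$ ranges over the hypothesis class, and the symbols $B$ and $M_{K+1}$ are introduced here for illustration.

```latex
% Taylor expansion of F around 0 with Lagrange remainder
F(z) = \sum_{k=0}^{K} \frac{F^{(k)}(0)}{k!}\, z^{k} + R_K(z),
\qquad
R_K(z) = \frac{F^{(K+1)}(\xi)}{(K+1)!}\, z^{K+1}
\quad \text{for some } \xi \text{ between } 0 \text{ and } z.

% Uniform bound over the hypothesis class, assuming
% B = \sup_{\theta \in \Theta} \sup_{x} |u_\theta(x)| < \infty
% and M_{K+1} = \sup_{|z| \le B} |F^{(K+1)}(z)| < \infty
\sup_{\theta \in \Theta} \sup_{x} \bigl| R_K\bigl(u_\theta(x)\bigr) \bigr|
\;\le\; \frac{M_{K+1}\, B^{K+1}}{(K+1)!}.
```

Under these assumptions the bound depends on the network only through the uniform bound $B$ on its outputs. If $F$ is a polynomial of degree at most $K$, then $F^{(K+1)} \equiv 0$ and the remainder vanishes. When $F$ acts on derivatives of $u_\theta$ as well, the analogous bound needs uniform bounds on those derivatives, which is the extra regularity the referee asks to have stated.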

minor comments (2)

[Section 2 (Preliminaries)] Notation for the augmented feature space and the rank parameter should be introduced with a clear definition and distinguished from standard network width to avoid confusion in the Koopman application.
[Abstract and Section 1] The abstract and introduction would benefit from a brief statement of the precise assumptions (e.g., smoothness class of the PDE solution) needed for the Taylor remainder to vanish uniformly.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive comments, which highlight important points for strengthening the rigor of our analysis. We address each major comment below and will revise the manuscript accordingly to make the assumptions and remainder controls explicit.

Referee: [Section 3 (Taylor expansion and Koopman lifting)] The Taylor linearization step (used to lift nonlinear operators such as convective or reaction terms to linear operators on an augmented feature space) does not provide uniform control over the Lagrange or integral remainder term. For the subsequent Rademacher or covering-number bounds to hold with the claimed dependence on network rank, the remainder must be shown to be o(1) uniformly over the hypothesis class and independent of network parameters; without this, the exponential enlargement claim and the 'high-rank networks generalize well' conclusion become conditional on unstated extra assumptions on solution regularity and higher derivatives.

Authors: We agree that uniform control of the remainder is essential. The manuscript implicitly relies on standard PDE regularity (solutions in C^3 with bounded higher derivatives), under which the Lagrange remainder is bounded by a term depending only on the solution and operator coefficients, independent of network parameters. In the revision we will add an explicit lemma in Section 3 deriving this uniform o(1) bound over the hypothesis class, state the required regularity assumptions upfront, and update all theorem statements to list them. This makes the subsequent Rademacher bounds rigorous while preserving the dependence on network rank. revision: yes
Referee: [Theorem 4.2 (or equivalent main generalization result)] The transition from the linearized operator to the final generalization bound (likely in the main theorem) appears to absorb the nonlinearity factor directly into an exponential term, but this step requires explicit verification that the remainder does not scale with network complexity or the differential operator's order; otherwise the bound may not be load-bearing for the central claims.

Authors: The nonlinearity factor enters solely through the finite dimension of the Koopman-augmented space (fixed by the Taylor order), which is independent of network width, depth, or rank. We will insert a short proposition immediately before Theorem 4.2 that bounds the remainder contribution by a constant depending only on PDE data and solution regularity, confirming it does not grow with network complexity. The proof of the main theorem will be expanded to reference this step explicitly, ensuring the exponential enlargement is attributable only to operator nonlinearity. revision: yes
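
To give a sense of how the dimension of a lifted space can depend on the Taylor order, the sketch below counts monomials of total degree at most K in d variables. It illustrates a generic monomial lift only. Whether the paper's augmented space has exactly this form is an assumption, and the function name `lifted_dimension` is hypothetical.

```python
from math import comb

def lifted_dimension(num_vars, order):
    """Number of monomials of total degree <= order in num_vars variables."""
    return comb(num_vars + order, order)

# Rows: Taylor order. Columns: number of lifted variables (e.g. u and its derivatives).
for order in (1, 2, 4, 8):
    print(order, [lifted_dimension(d, order) for d in (1, 3, 6)])
```

The count depends only on the number of variables and the Taylor order, not on any network width or depth, which matches the independence the simulated authors describe. It grows quickly once several variables are lifted together.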

Circularity Check

0 steps flagged

No circularity: bounds derived from independent operator lifting and covering arguments

The derivation applies Taylor expansion to lift nonlinear differential operators to linear operators on an augmented space, then invokes standard Rademacher or covering-number bounds on the resulting high-rank network class. This chain does not reduce any claimed bound to a fitted parameter, self-referential definition, or load-bearing self-citation; the Koopman step is a standard linearization technique whose remainder control is an external assumption rather than an internal tautology. The paper therefore remains self-contained against external benchmarks.
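
As background on the "standard Rademacher" ingredient, the following sketch estimates the empirical Rademacher complexity of a norm-bounded linear class by Monte Carlo and compares it with the classical bound obtained from the Cauchy–Schwarz and Jensen inequalities. It is a textbook illustration, not the paper's bound. The function names and the random data are assumptions made for the example.

```python
import numpy as np

def empirical_rademacher_linear(X, B, n_draws=2000, seed=0):
    """Monte Carlo estimate for the class {x -> <w, x> : ||w||_2 <= B}.

    For fixed signs sigma, the supremum over w has the closed form
    (B / n) * ||sum_i sigma_i x_i||_2 (Cauchy-Schwarz, attained by aligning w).
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    sigma = rng.choice([-1.0, 1.0], size=(n_draws, n))
    sups = B / n * np.linalg.norm(sigma @ X, axis=1)
    return float(sups.mean())

def rademacher_upper_bound(X, B):
    """Classical bound B * sqrt(sum_i ||x_i||^2) / n, via Jensen's inequality."""
    n = X.shape[0]
    return float(B * np.sqrt((X ** 2).sum()) / n)

# Random placeholder inputs; these are not data from the paper.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))
estimate = empirical_rademacher_linear(X, B=1.0)
bound = rademacher_upper_bound(X, B=1.0)
print(estimate, bound)
```

The estimate sits below the bound, and both decay at the usual 1/sqrt(n) rate for inputs of bounded norm. In a lifted setting the rows of X would be the lifted feature vectors.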

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claim rests on Taylor expansion converting nonlinear operators to linear ones and on Koopman theory applying to the resulting high-dimensional linear system. Both are standard mathematical tools whose specific applicability here is asserted but cannot be verified from the abstract alone.

axioms (2)

domain assumption: Taylor expansion represents nonlinear differential operators as linear operators on a high-dimensional space. Invoked to enable Koopman-based generalization analysis for PINNs.
standard math: Koopman operator theory yields generalization bounds once the operator is linearized. Core step after the Taylor transformation.

pith-pipeline@v0.9.0 · 5430 in / 1238 out tokens · 70767 ms · 2026-05-14T19:56:39.618009+00:00 · methodology

Reference graph

Works this paper leans on

30 extracted references · 30 canonical work pages

[1] Martin Anthony and Peter L. Bartlett. Neural network learning: Theoretical foundations. Cambridge University Press, 2009.

[2] Sanjeev Arora, Rong Ge, Behnam Neyshabur, and Yi Zhang. Stronger generalization bounds for deep nets via a compression approach. In Proceedings of the 35th International Conference on Machine Learning (ICML), 2018.

[3] Peter L. Bartlett, Dylan J. Foster, and Matus J. Telgarsky. Spectrally-normalized margin bounds for neural networks. In Proceedings of the 31st Conference on Neural Information Processing Systems (NIPS), 2017.

[4] Peter L. Bartlett and Shahar Mendelson. Rademacher and Gaussian complexities: Risk bounds and structural results. Journal of Machine Learning Research, 3:463–482, 2002.

[5] Stefano Berrone, Claudio Canuto, and Moreno Pintore. Solving PDEs by variational physics-informed neural networks: an a posteriori error analysis. Annali dell’Università di Ferrara, 68:575–595, 2022.

[6] Kunal Bhardwaj, Alok Rai, and Subhajit Sanyal. A variational physics-informed neural network framework using Petrov–Galerkin method for solving singularly perturbed boundary value problems. Applied Mathematics and Computation, 451:127268, 2023.

[7] Tommaso Botarelli, Marco Fanfani, Paolo Nesi, and Lorenzo Pinelli. Using physics-informed neural networks for solving Navier-Stokes equations in complex scenarios. SSRN Electronic Journal, 2024.

[8] Shengze Cai, Zhiping Mao, Zhicheng Wang, Minglang Yin, and George Em Karniadakis. Physics-informed neural networks (PINNs) for fluid mechanics: a review. Acta Mechanica Sinica, 37(12):1727–1738, 2021.

[9] Michael G. Crandall and Pierre-Louis Lions. Viscosity solutions of Hamilton–Jacobi equations. Transactions of the American Mathematical Society, 277(1):1–42, 1983.

[10] Miles Cranmer, Sam Greydanus, Stephan Hoyer, Peter Battaglia, David Spergel, and Shirley Ho. Lagrangian neural networks. In ICLR Workshop on Integration of Deep Neural Models and Differential Equations, 2020.

[11] Xavier Glorot and Yoshua Bengio. Understanding the difficulty of training deep feedforward neural networks. In Proceedings of the 13th International Conference on Artificial Intelligence and Statistics (AISTATS), 2010.

[12] Noah Golowich, Alexander Rakhlin, and Ohad Shamir. Size-independent sample complexity of neural networks. In Proceedings of the 2018 Conference On Learning Theory (COLT), 2018.

[13] Shuyu Gong, Ziwei Zhou, and Jiguang Bao. Existence and uniqueness of viscosity solutions to the exterior problem of a parabolic Monge–Ampère equation. Communications on Pure and Applied Analysis, 19(10):4921–4936, 2020.

[14] Sam Greydanus, Misko Dzamba, and Jason Yosinski. Hamiltonian neural networks. In Proceedings of the 33rd Conference on Neural Information Processing Systems (NeurIPS), 2019.

[15] Nick Harvey, Christopher Liaw, and Abbas Mehrabian. Nearly-tight VC-dimension bounds for piecewise linear neural networks. In Proceedings of the 2017 Conference on Learning Theory (COLT), pages 1064–1068, 2017.

[16] Yuka Hashimoto, Sho Sonoda, Isao Ishikawa, and Masahiro Ikeda. Why high-rank neural networks generalize?: An algebraic framework with RKHSs. In Proceedings of the 14th International Conference on Learning Representations (ICLR), 2026.

[17] Yuka Hashimoto, Sho Sonoda, Isao Ishikawa, Atsushi Nitanda, and Taiji Suzuki. Koopman-based generalization bound: New aspect for full-rank weights. In Proceedings of the 12th International Conference on Learning Representations (ICLR), 2024.

[18] Haotian Ju, Dongyue Li, and Hongyang R. Zhang. Robust fine-tuning of deep neural networks with Hessian-based generalization guarantees. In Proceedings of the 39th International Conference on Machine Learning (ICML), 2022.

[19] Shuai Li, Kui Jia, Yuxin Wen, Tongliang Liu, and Dacheng Tao. Orthogonal deep neural networks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 43(4):1352–1368, 2021.

[20] Siddhartha Mishra and Roberto Molinaro. Estimates for generalization error of physics-informed neural networks for approximating PDEs. IMA Journal of Numerical Analysis, 43(1):1–43, 2023.

[21] Mehryar Mohri, Afshin Rostamizadeh, and Ameet Talwalkar. Foundations of Machine Learning. MIT Press, 2nd edition, 2018.

[22] Behnam Neyshabur, Srinadh Bhojanapalli, and Nathan Srebro. A PAC-Bayesian approach to spectrally-normalized margin bounds for neural networks. In Proceedings of the 6th International Conference on Learning Representations (ICLR), 2018.

[23] Behnam Neyshabur, Ryota Tomioka, and Nathan Srebro. Norm-based capacity control in neural networks. In Proceedings of the 2015 Conference on Learning Theory (COLT), 2015.

[24] Maziar Raissi, Paris Perdikaris, and George Karniadakis. Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational Physics, 378:686–707, 2019.

[25] Zhiyuan Ren, Shijie Zhou, Dong Liu, and Qihe Liu. Physics-informed neural networks: A review of methodological evolution, theoretical foundations, and interdisciplinary frontiers toward next-generation scientific computing. Applied Sciences, 15(14):8092, 2025.

[26] Taiji Suzuki, Hiroshi Abe, and Tomoaki Nishimura. Compression based bound for non-compressed network: unified generalization error analysis of large compressible deep neural network. In Proceedings of the 8th International Conference on Learning Representations (ICLR), 2020.

[27] Colin Wei and Tengyu Ma. Data-dependent sample complexity of deep neural networks via Lipschitz augmentation. In Proceedings of the 33rd Conference on Neural Information Processing Systems (NeurIPS), 2019.

[28] Colin Wei and Tengyu Ma. Improved sample complexities for deep neural networks and robust classification via an all-layer margin. In Proceedings of the 8th International Conference on Learning Representations (ICLR), 2020.

[29] Weinan E, Chao Ma, and Lei Wu. The Barron space and the flow-induced function spaces for neural network models. Constructive Approximation, 55:369–406, 2022.

[30] Xianliang Xu, Ye Li, and Zhongyi Huang. Refined generalization analysis of the deep Ritz method and physics-informed neural networks. In Proceedings of the 42nd International Conference on Machine Learning (ICML), 2025.
