
arxiv: 2601.00473 · v3 · pith:V7MGSVWXnew · submitted 2026-01-01 · 💻 cs.LG · cs.AI

Deep Neural Networks as Discrete Dynamical Systems: Implications for Physics-Informed Learning

Pith reviewed 2026-05-21 16:20 UTC · model grok-4.3

classification 💻 cs.LG cs.AI
keywords physics-informed neural networks · discrete dynamical systems · Burgers equation · Eikonal equation · PINNs · PDE approximation · neural integral equations · attractor dynamics

The pith

Physics-informed neural networks approximate the same PDE dynamics as traditional discretization but through a different pathway of dense learned parameters.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper revisits the view of feed-forward deep neural networks as discrete dynamical systems that arise from neural integral equations and their PDE equivalents. It compares the outputs of physics-informed neural networks against exact and numerical solutions for the Burgers and Eikonal equations to illustrate that both routes capture essentially the same underlying dynamics. The analysis highlights that networks evolve layer by layer toward attractors, so that many different parameter sets can produce comparable results because the inverse mapping is degenerate. In place of the structured stencils of finite-difference methods, the networks learn dense representations that carry more parameters and lower direct interpretability yet may scale better when classical grids become impractical.

Core claim

Within this framework, DNNs can be interpreted as discrete dynamical systems whose layer-wise evolution approaches attractors, and multiple parameter configurations may yield comparable solutions, reflecting the degeneracy of the inverse mapping. In contrast to the structured operators associated with finite-difference procedures, PINNs learn dense parameter representations that are not directly associated with classical discretization stencils. This distributed representation generally involves a larger number of parameters, leading to reduced interpretability and increased computational cost. However, the additional flexibility of such representations may offer advantages in high-

What carries the argument

The analogy between feed-forward DNNs and discrete dynamical systems obtained from neural integral equations and their corresponding PDE forms, in which successive layers drive the state toward attractors.
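The attractor picture can be illustrated with a toy layer map. The sketch below is not the paper's construction: it assumes a single shared weight matrix `W` and bias `b` (both hypothetical, randomly drawn), with `W` rescaled to spectral norm 0.5. Because tanh is 1-Lipschitz, the map x -> tanh(W x + b) is then a contraction, so repeated application converges to one fixed point from any starting state.

```python
import numpy as np

def layer_map(x, W, b):
    """One 'layer' viewed as a single step of a discrete dynamical system."""
    return np.tanh(W @ x + b)

def iterate_layers(x0, W, b, depth):
    """Apply the same layer map `depth` times and return the whole trajectory."""
    traj = [x0]
    for _ in range(depth):
        traj.append(layer_map(traj[-1], W, b))
    return np.array(traj)

rng = np.random.default_rng(0)
n = 8
W = rng.normal(size=(n, n))
W *= 0.5 / np.linalg.norm(W, 2)   # assumption: spectral norm 0.5, so the map contracts
b = rng.normal(size=n)

# Two different initial states pushed through the same 60 "layers".
traj_a = iterate_layers(rng.normal(size=n), W, b, depth=60)
traj_b = iterate_layers(rng.normal(size=n), W, b, depth=60)

# Distance moved per layer, and distance between the two final states.
step_sizes = np.linalg.norm(np.diff(traj_a, axis=0), axis=1)
final_gap = np.linalg.norm(traj_a[-1] - traj_b[-1])
print(step_sizes[0], step_sizes[-1], final_gap)
```

In a trained network each layer has its own weights and nothing guarantees a contraction, so this only shows what "layer-wise evolution toward an attractor" means in the simplest setting.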

If this is right

  • PINN solutions match those obtained by standard numerical discretization for the Burgers and Eikonal equations.
  • The learned representations use dense parameter sets rather than the sparse stencils of finite-difference schemes.
  • The larger parameter count reduces direct interpretability compared with classical discretization operators.
  • The extra flexibility may become useful for high-dimensional problems where grid-based methods are impractical.
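The contrast between a sparse stencil and a dense parameter set can be made concrete with a small count. The sketch below builds the standard second-order central-difference operator for d²/dx² on a uniform grid and compares its number of nonzero entries with that of a dense matrix of the same shape. The grid size and the test function sin(x) are illustrative choices, not taken from the paper.

```python
import numpy as np

def second_difference_matrix(n, h):
    """Central-difference approximation of d^2/dx^2 on n interior grid points
    with homogeneous Dirichlet boundaries: stencil [1, -2, 1] / h^2."""
    D = np.zeros((n, n))
    idx = np.arange(n)
    D[idx, idx] = -2.0
    D[idx[:-1], idx[:-1] + 1] = 1.0   # superdiagonal
    D[idx[1:], idx[1:] - 1] = 1.0     # subdiagonal
    return D / h**2

n = 63
h = np.pi / (n + 1)
x = np.linspace(0.0, np.pi, n + 2)[1:-1]   # interior points of [0, pi]
D = second_difference_matrix(n, h)

u = np.sin(x)                               # vanishes at both boundaries
max_err = np.max(np.abs(D @ u - (-np.sin(x))))

stencil_nonzeros = np.count_nonzero(D)      # 3n - 2 entries, all set by one stencil
dense_entries = n * n                       # entries of a dense n x n weight matrix
print(max_err, stencil_nonzeros, dense_entries)
```

Every nonzero entry of `D` is fixed by a three-point stencil and the grid spacing, whereas each entry of a dense layer is a free parameter. This is the sense in which the stencil operator is the more interpretable of the two.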

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The attractor picture could be tested by tracking how the network state changes when depth is increased while output accuracy is held fixed.
  • The same degeneracy argument might suggest that ensemble averaging over several trained networks could reduce sensitivity to initialization without extra regularization.

Load-bearing premise

Feed-forward networks can be treated as discrete dynamical systems whose successive layers evolve toward attractors, so that many parameter choices produce essentially the same output.

What would settle it

Train a PINN on the Burgers equation, then inspect the hidden-layer activations across many random initializations to check whether they converge to a common attractor independent of the specific weights chosen.

Figures

Figures reproduced from arXiv: 2601.00473 by Abhisek Ganguly, Santosh Ansumali, Sauro Succi.

Figure 1: The basic blocks of the DNN architecture. The procedure is repeated …
Figure 2: Comparison of PINN solutions for the 1D Burgers’ equation.
Figure 3: The weight matrices for layers 1, 4 and 7 of the PINN solution of the …
Figure 4: The singular value (σ) representation for layers 1, 4 and 7 of the PINN solution of the (top) viscous and (bottom) inviscid Burgers’ equation for two separate, independent runs. The spectral representations highlight structural differences (or similarities) in the trained weight matrices across the runs.
Figure 5: The figure shows the evolution of a 2D input signal as it propagates …
Figure 6: The figure compares the solutions obtained by two PINN runs for the …
Figure 7: The figure shows the heatmap of the activation values of each …
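The Figure 4 caption refers to singular-value spectra of weight matrices from independent runs. A minimal sketch of how such spectra can be computed and compared is given below. The matrices are random placeholders standing in for trained weights (an assumption, not the paper's data), and the relative-difference measure at the end is an illustrative choice.

```python
import numpy as np

def singular_spectrum(W):
    """Singular values of a weight matrix, returned in descending order."""
    return np.linalg.svd(W, compute_uv=False)

rng = np.random.default_rng(1)
width = 20
# Placeholder weights standing in for one hidden layer from two independent runs.
W_run1 = rng.normal(scale=1.0 / np.sqrt(width), size=(width, width))
W_run2 = rng.normal(scale=1.0 / np.sqrt(width), size=(width, width))

s1 = singular_spectrum(W_run1)
s2 = singular_spectrum(W_run2)

# Relabeling neurons (permuting rows and columns) changes the entries of W
# but leaves its singular values unchanged.
perm_rows = rng.permutation(width)
perm_cols = rng.permutation(width)
s1_permuted = singular_spectrum(W_run1[perm_rows][:, perm_cols])

# Relative difference between the two spectra (illustrative measure).
spectral_gap_between_runs = np.linalg.norm(s1 - s2) / np.linalg.norm(s1)
print(spectral_gap_between_runs)
```

Because the spectrum ignores neuron ordering, it compares two runs on a footing that an entry-by-entry comparison of the weight matrices cannot provide.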
Original abstract

We revisit the analogy between feed-forward deep neural networks (DNNs) and discrete dynamical systems derived from neural integral equations and their corresponding partial differential equation (PDE) forms. A comparative analysis between the numerical/exact solutions of the Burgers' and Eikonal equations, and the same obtained via PINNs is presented. We show that PINN learning provides a different computational pathway compared to standard numerical discretization in approximating essentially the same underlying dynamics of the system. Within this framework, DNNs can be interpreted as discrete dynamical systems whose layer-wise evolution approaches attractors, and multiple parameter configurations may yield comparable solutions, reflecting the degeneracy of the inverse mapping. In contrast to the structured operators associated with finite-difference (FD) procedures, PINNs learn dense parameter representations that are not directly associated with classical discretization stencils. This distributed representation generally involves a larger number of parameters, leading to reduced interpretability and increased computational cost. However, the additional flexibility of such representations may offer advantages in high-dimensional settings where classical grid-based methods become impractical.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper revisits the analogy between feed-forward deep neural networks (DNNs) and discrete dynamical systems derived from neural integral equations and their PDE forms. It presents a comparative analysis of numerical/exact solutions versus PINN solutions for the Burgers' and Eikonal equations, claiming that PINN learning offers a different computational pathway to approximate the same underlying dynamics. The authors interpret DNN layers as discrete dynamical systems whose evolution approaches attractors, with multiple parameter configurations yielding comparable solutions due to degeneracy of the inverse mapping; they contrast this dense learned representation with the structured operators of finite-difference discretizations, noting trade-offs in interpretability and cost but potential advantages in high dimensions.

Significance. If the central claims are rigorously verified, the work could offer a useful conceptual bridge between PINN training dynamics and classical dynamical systems theory, potentially explaining convergence behavior and suggesting new ways to analyze or regularize PINNs in high-dimensional settings. The emphasis on degeneracy and attractor-like evolution might inform practical choices in network architecture or initialization. However, the current lack of quantitative support and explicit constructions limits the strength of these implications.

major comments (3)
  1. [Comparative analysis of Burgers' and Eikonal equations] Comparative analysis section (as described in the abstract and results): the manuscript provides no quantitative error metrics, convergence rates, or explicit verification details for the attractor or degeneracy claims, leaving the assertion that PINNs approximate the same underlying dynamics only weakly supported by the presented text.
  2. [DNN as discrete dynamical systems] Framework section on DNNs as discrete dynamical systems: the layer-wise evolution approaching attractors is asserted via neural integral equations, but no explicit construction of the layer-to-layer map is given for the trained PINN on the Burgers/Eikonal cases, nor is there verification that iterating those maps reproduces the PDE flow or quantification of attractor convergence independent of the PDE residual loss.
  3. [Degeneracy of the inverse mapping] Discussion of degeneracy in the inverse mapping: this property is framed in terms of fitted network weights after training, which introduces circularity because the claimed degeneracy is observed post-training rather than derived independently of the fitting process or tested via parameter perturbations or multiple initializations.
minor comments (2)
  1. [Abstract] The abstract would be strengthened by briefly indicating the specific quantitative comparisons or verification methods used for the attractor and degeneracy claims.
  2. [Framework] Notation for the neural integral equation and its discrete dynamical system interpretation could be clarified with an explicit equation reference to the layer-to-layer transition map.
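The quantitative metrics the referee asks for can be as simple as discrete error norms over a set of test points. The sketch below assumes mean-normalised L1 and L2 norms plus a maximum norm. These definitions and the two arrays are placeholders for illustration and are not taken from the manuscript.

```python
import numpy as np

def error_norms(u_pred, u_ref):
    """Discrete error norms over n test points (mean-normalised L1 and L2, plus max)."""
    e = np.asarray(u_pred) - np.asarray(u_ref)
    n = e.size
    return {
        "L1": np.sum(np.abs(e)) / n,
        "L2": np.sqrt(np.sum(e**2) / n),
        "Linf": np.max(np.abs(e)),
    }

x = np.linspace(-1.0, 1.0, 201)
u_ref = -np.sin(np.pi * x)                      # placeholder reference solution
u_pred = u_ref + 1e-3 * np.cos(3 * np.pi * x)   # placeholder "network output"

norms = error_norms(u_pred, u_ref)
print(norms)
```

With this normalisation the three values always satisfy L1 ≤ L2 ≤ Linf, which gives a quick sanity check when reporting them side by side.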

Simulated Author's Rebuttal

3 responses · 0 unresolved

Thank you for the constructive feedback. We address each major comment below, indicating revisions to strengthen the quantitative support and explicit constructions in the manuscript.

point-by-point responses
  1. Referee: Comparative analysis section (as described in the abstract and results): the manuscript provides no quantitative error metrics, convergence rates, or explicit verification details for the attractor or degeneracy claims, leaving the assertion that PINNs approximate the same underlying dynamics only weakly supported by the presented text.

    Authors: We agree that quantitative metrics are needed to support the claims. In the revised version, we will add L2 error norms between PINN solutions and reference solutions for both the Burgers' and Eikonal equations, along with convergence rates as a function of network depth to verify approximation of the dynamics.
    revision: yes

  2. Referee: Framework section on DNNs as discrete dynamical systems: the layer-wise evolution approaching attractors is asserted via neural integral equations, but no explicit construction of the layer-to-layer map is given for the trained PINN on the Burgers/Eikonal cases, nor is there verification that iterating those maps reproduces the PDE flow or quantification of attractor convergence independent of the PDE residual loss.

    Authors: We will include an explicit derivation of the layer-to-layer map from the neural integral equation applied to the trained PINN architectures for the example PDEs. We will also add numerical verification by iterating the fixed map from initial conditions and quantifying convergence to the attractor, separate from the training residual.
    revision: yes

  3. Referee: Discussion of degeneracy in the inverse mapping: this property is framed in terms of fitted network weights after training, which introduces circularity because the claimed degeneracy is observed post-training rather than derived independently of the fitting process or tested via parameter perturbations or multiple initializations.

    Authors: We will first derive the degeneracy from the underdetermined inverse problem inherent in the neural integral equation framework, independent of training. We will supplement this with empirical results from multiple random initializations showing different parameter sets yielding comparable solutions.
    revision: partial
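One elementary source of degeneracy exists independently of training: relabeling the units of a hidden layer changes the weight matrices but not the function the network computes. The sketch below demonstrates this for a small tanh network with random weights. The layer sizes are hypothetical, and this covers only the permutation symmetry, not the broader degeneracy argued for in the paper.

```python
import numpy as np

def mlp(x, params):
    """Small tanh MLP: x has shape (batch, d_in); params is a list of (W, b)."""
    h = x
    for W, b in params[:-1]:
        h = np.tanh(h @ W + b)
    W_out, b_out = params[-1]
    return h @ W_out + b_out

def permute_hidden(params, layer, perm):
    """Relabel the units of hidden layer `layer` (0-based) using permutation `perm`."""
    new = [(W.copy(), b.copy()) for W, b in params]
    W, b = new[layer]
    new[layer] = (W[:, perm], b[perm])           # reorder this layer's output units
    W_next, b_next = new[layer + 1]
    new[layer + 1] = (W_next[perm, :], b_next)   # reorder matching input rows downstream
    return new

rng = np.random.default_rng(2)
sizes = [2, 16, 16, 1]   # hypothetical: inputs (x, t), two hidden layers, scalar output
params = [(rng.normal(size=(m, n)), rng.normal(size=n))
          for m, n in zip(sizes[:-1], sizes[1:])]

perm = rng.permutation(16)
params_perm = permute_hidden(params, layer=0, perm=perm)

x = rng.normal(size=(50, 2))
out_a = mlp(x, params)
out_b = mlp(x, params_perm)

weight_change = np.linalg.norm(params[0][0] - params_perm[0][0])
output_change = np.max(np.abs(out_a - out_b))
print(weight_change, output_change)
```

The two parameter sets differ by a visible amount in weight space while the outputs agree to floating-point precision, so distinct points in parameter space map to the same solution.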

Circularity Check

0 steps flagged

No circularity: derivation relies on external numerical benchmarks and prior analogy without self-referential reduction

full rationale

The paper revisits the established analogy between feed-forward DNNs and discrete dynamical systems originating from neural integral equations, then directly compares PINN outputs against exact/numerical solutions of the Burgers' and Eikonal equations. The statements on layer-wise evolution approaching attractors and degeneracy of the inverse mapping are presented as interpretive consequences within that framework rather than as predictions or results derived from the fitted weights themselves. No equation or claim reduces by construction to a fitted parameter, self-citation chain, or input data; the central claim that PINNs offer a different computational pathway is supported by explicit comparative results and remains falsifiable against independent discretizations.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The central claim rests on the dynamical-systems analogy and the observation of degeneracy after training; no new physical entities are introduced and the only free parameters are the network weights learned during PINN optimization.

free parameters (1)
  • PINN network weights and biases
    These are the dense learned parameters that define the distributed representation and are fitted to satisfy the PDE residual.
axioms (1)
  • domain assumption: Feed-forward deep neural networks can be modeled as discrete dynamical systems derived from neural integral equations and their PDE forms.
    This foundational analogy is invoked at the start of the abstract and underpins the entire comparison.

pith-pipeline@v0.9.0 · 5717 in / 1398 out tokens · 50523 ms · 2026-05-21T16:20:21.442547+00:00 · methodology

