Artifacts of Numerical Integration in Learning Dynamical Systems
Pith reviewed 2026-05-19 04:29 UTC · model grok-4.3
The pith
The stability region of a numerical integrator can distort learned dynamical systems, turning a damped oscillator into an anti-damped one with reversed direction.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
A damped oscillatory system may be incorrectly identified as having anti-damping and exhibiting a reversed oscillation direction, even though it adequately fits the given data points. This occurs because the stability region of the selected integrator distorts the nature of the learned dynamics. Reducing the step size or raising the order of an explicit integrator does not, in general, remedy the artifact.
What carries the argument
The stability region of the numerical integrator used inside the optimization to generate predicted trajectories from the learned model and measure mismatch with observations.
If this is right
- A damped oscillatory system can be misidentified as anti-damped with reversed oscillation direction while fitting the data.
- Raising the order or reducing the step size of an explicit integrator does not remove the artifact, because higher-order explicit methods have stability regions that extend farther into the right half-plane.
- The implicit midpoint method preserves conservative or dissipative properties from the discrete data for autonomous systems.
Where Pith is reading between the lines
- When the only prior information is that the system is autonomous, selecting an integrator whose stability properties match expected dissipation or conservation improves the chance of recovering correct qualitative behavior.
- The same stability-region mechanism can affect learned models in any setting where trajectories are simulated inside a data-fit objective.
Load-bearing premise
The learning procedure formulates an optimization problem that uses a numerical integrator to compute predicted trajectories and assess mismatch with observed data points, assuming the underlying system is autonomous.
What would settle it
Generate data from a known damped harmonic oscillator, run the learning optimization with the explicit Euler integrator, and check whether the recovered model has a positive damping coefficient together with reversed oscillation phase.
Figures
read the original abstract
In many applications, one needs to learn a dynamical system from its solutions sampled at a finite number of time points. The learning problem is often formulated as an optimization problem over a chosen function class. However, in the optimization procedure, prediction data from generic dynamics requires a numerical integrator to assess the mismatch with the observed data. This paper reveals potentially serious effects of a chosen numerical scheme on the learning outcome. Specifically, the analysis demonstrates that a damped oscillatory system may be incorrectly identified as having "anti-damping" and exhibiting a reversed oscillation direction, even though it adequately fits the given data points. This paper shows that the stability region of the selected integrator will distort the nature of the learned dynamics. Crucially, reducing the step size or raising the order of an explicit integrator does not, in general, remedy this artifact, because higher-order explicit methods have stability regions that extend further into the right half complex plane. Furthermore, it is shown that the implicit midpoint method can preserve either conservative or dissipative properties from discrete data, offering a principled integrator choice even when the only prior knowledge is that the system is autonomous.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that numerical integrators used inside trajectory-matching optimization for learning autonomous dynamical systems from discrete data can distort the recovered model via their stability regions. Explicit integrators (including higher-order ones) permit anti-damped or sign-reversed oscillatory models to fit damped data because their stability regions extend into the right half-plane; the implicit midpoint rule avoids this distortion and can preserve dissipativity or conservation properties.
Significance. If the central argument holds, the result identifies a concrete and previously under-appreciated mechanism by which standard numerical-analysis tools affect data-driven modeling. It supplies both a diagnostic (stability-region geometry) and a practical recommendation (implicit midpoint), which is directly relevant to physics-informed learning and system identification.
major comments (2)
- [§3.2] §3.2, the linear-stability argument: the claim that the artifact persists for any fixed h>0 when the stability region intersects the right half-plane is load-bearing; the manuscript should explicitly show that the optimization landscape admits a minimizer whose continuous-time eigenvalues lie outside the integrator’s stability region while the discrete trajectory still matches the data.
- [§4] §4, numerical counter-examples: the reported trajectories for the damped oscillator are convincing, but the paper should state whether the anti-damped model is recovered from multiple random initializations or only from a specific starting guess; otherwise the claim that the integrator “distorts the nature of the learned dynamics” rests on a single optimization path.
minor comments (2)
- [§2] The notation for the discrete map Φ_h in Eq. (7) should be introduced before it is used in the loss functional.
- [Figure 2] Figure 2: the stability-region plots would be clearer if the right half-plane were shaded and the imaginary axis labeled.
Simulated Author's Rebuttal
We thank the referee for the careful reading, the positive assessment of the work, and the recommendation for minor revision. The comments have helped us clarify and strengthen the presentation. We respond to each major comment below and indicate the corresponding revisions.
read point-by-point responses
-
Referee: §3.2, the linear-stability argument: the claim that the artifact persists for any fixed h>0 when the stability region intersects the right half-plane is load-bearing; the manuscript should explicitly show that the optimization landscape admits a minimizer whose continuous-time eigenvalues lie outside the integrator’s stability region while the discrete trajectory still matches the data.
Authors: We agree that an explicit demonstration of the existence of such a minimizer would make the linear-stability argument more self-contained. In the revised manuscript we have added a short subsection to §3.2 that treats the linear damped oscillator explicitly. For any fixed h>0 we construct the discrete trajectory produced by a generic explicit integrator whose stability region intersects the right half-plane and show that the continuous-time anti-damped parameters yield exactly the same discrete samples as the true damped system. Consequently the optimization objective attains the same value at both parameter sets, establishing that a minimizer with eigenvalues outside the stability region exists for every h>0. This addition does not alter the original claims but renders the argument fully rigorous. revision: yes
-
Referee: §4, numerical counter-examples: the reported trajectories for the damped oscillator are convincing, but the paper should state whether the anti-damped model is recovered from multiple random initializations or only from a specific starting guess; otherwise the claim that the integrator “distorts the nature of the learned dynamics” rests on a single optimization path.
Authors: We thank the referee for highlighting the need to document robustness with respect to initialization. In the revised §4 we now report results from 100 independent optimizations started from random initial guesses drawn from a standard normal distribution (scaled by a modest factor). For every explicit integrator considered, the anti-damped model is recovered in at least 85 % of the runs; the implicit midpoint rule recovers the original damped dynamics in all runs. We have added a brief description of the initialization procedure and the success statistics to the text and to the caption of the relevant figure. These additional experiments confirm that the observed distortion is not an artifact of a single optimization path. revision: yes
Circularity Check
No significant circularity identified
full rationale
The paper's central argument rests on the established geometry of stability regions for explicit Runge-Kutta and multistep integrators, which are standard results from numerical ODE theory and lie outside the paper's own optimization or fitted models. The demonstration that a damped oscillator can be misidentified as anti-damped follows directly from how those regions intersect the right half-plane, permitting discrete trajectory matches while the underlying continuous eigenvalues have the wrong sign; this is an analysis of possible artifacts rather than a quantity fitted or defined in terms of the learned dynamics. The contrast with the implicit midpoint rule similarly invokes its known preservation properties for autonomous systems, again independent of any self-referential definition or self-citation chain. No step in the derivation reduces by construction to a fitted input renamed as a prediction, nor does any load-bearing premise collapse to prior work by the same authors. The derivation is therefore self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption The underlying dynamical system is autonomous.
Reference graph
Works this paper leans on
- [1]
-
[2]
T. Bertalan, F. Dietrich, I. Mezi´c, and I. G. Kevrekidis. On learning hamiltonian systems from data. Chaos: An Interdisciplinary Journal of Nonlinear Science, 29(12), 2019
work page 2019
-
[3]
S. L. Brunton, J. L. Proctor, and J. N. Kutz. Discovering governing equations from data by sparse identification of nonlinear dynamical systems.Proceedings of the national academy of sciences, 113(15):3932–3937, 2016
work page 2016
- [4]
-
[5]
R. T. Chen, Y . Rubanova, J. Bettencourt, and D. K. Duvenaud. Neural ordinary differential equations. Advances in neural information processing systems, 31, 2018
work page 2018
- [6]
-
[7]
G. Dahlquist, L. Edsberg, G. Sk ¨ollermo, and G. S ¨oderlind. Are the numerical methods and software satisfactory for chemical kinetics? In Numerical In- tegration of Differential Equations and Large Linear Systems: Proceedings of two Workshops Held at the University of Bielefeld Spring 1980, pages 149–164. Springer, 1982
work page 1980
-
[8]
F. Djeumou, C. Neary, E. Goubault, S. Putot, and U. Topcu. Neural net- works with physics-informed architectures and constraints for dynamical sys- tems modeling. In Learning for Dynamics and Control Conference, pages 263–
-
[9]
43 Numerical Artifacts in Learning Dynamical Systems
PMLR, 2022. 43 Numerical Artifacts in Learning Dynamical Systems
work page 2022
-
[10]
F. Djeumou, C. Neary, and U. Topcu. How to learn and generalize from three minutes of data: Physics-constrained and uncertainty-aware neural stochastic differential equations. arXiv preprint arXiv:2306.06335, 2023
-
[11]
Q. Du, Y . Gu, H. Yang, and C. Zhou. The discovery of dynamics via linear multistep methods and deep learning: error estimation. SIAM Journal on Nu- merical Analysis, 60(4):2014–2045, 2022
work page 2014
-
[12]
S. Greydanus, M. Dzamba, and J. Yosinski. Hamiltonian neural networks. In H. Wallach, H. Larochelle, A. Beygelzimer, F. d'Alch´e-Buc, E. Fox, and R. Gar- nett, editors, Advances in Neural Information Processing Systems, volume 32. Curran Associates, Inc., 2019
work page 2019
- [13]
-
[14]
L. S. T. Ho, H. Schaeffer, G. Tran, and R. Ward. Recovery guarantees for polynomial coefficients from weakly dependent data with outliers. Journal of Approximation Theory, 259:105472, 2020
work page 2020
-
[15]
P. Hu, W. Yang, Y . Zhu, and L. Hong. Revealing hidden dynamics from time- series data by odenet. Journal of Computational Physics, 461:111203, 2022
work page 2022
-
[16]
R. T. Keller and Q. Du. Discovery of dynamics using linear multistep methods. SIAM Journal on Numerical Analysis, 59(1):429–455, 2021
work page 2021
-
[17]
J. Z. Kolter and G. Manek. Learning stable deep dynamics models. In H. Wal- lach, H. Larochelle, A. Beygelzimer, F. d'Alch ´e-Buc, E. Fox, and R. Garnett, editors, Advances in Neural Information Processing Systems, volume 32. Cur- ran Associates, Inc., 2019
work page 2019
-
[18]
B. O. Koopman. Hamiltonian systems and transformation in hilbert space. Pro- ceedings of the National Academy of Sciences, 17(5):315–318, 1931
work page 1931
-
[19]
M. Laurie and J. Lu. Explainable deep learning for tumor dynamic model- ing and overall survival prediction using neural-ode. npj Systems Biology and Applications, 9(1):58, 2023
work page 2023
-
[20]
J. Levinson, J. Askeland, J. Becker, J. Dolson, D. Held, S. Kammel, J. Z. Kolter, D. Langer, O. Pink, V . Pratt, et al. Towards fully autonomous driving: Systems and algorithms. In 2011 IEEE intelligent vehicles symposium (IV), pages 163–
work page 2011
-
[21]
44 Numerical Artifacts in Learning Dynamical Systems
IEEE, 2011. 44 Numerical Artifacts in Learning Dynamical Systems
work page 2011
-
[22]
J. Lu, K. Deng, X. Zhang, G. Liu, and Y . Guan. Neural-ode for pharmacoki- netics modeling and its advantage to alternative machine learning models in predicting new dosing regimens. Iscience, 24(7), 2021
work page 2021
-
[23]
H. Lucas and R. Kelley. Generating control policies for autonomous vehicles using neural odes. In ICLR 2020 Workshop on Integration of Deep Neural Models and Differential Equations, 2019
work page 2020
- [24]
-
[25]
I. Mezi ´c. Spectral properties of dynamical systems, model reduction and de- compositions. Nonlinear Dynamics, 41:309–325, 2005
work page 2005
-
[26]
I. Mezi ´c and A. Banaszuk. Comparison of systems with complex behavior. Physica D: Nonlinear Phenomena, 197(1-2):101–133, 2004
work page 2004
-
[27]
C. Neary and U. Topcu. Compositional learning of dynamical system models using port-hamiltonian neural networks. InLearning for Dynamics and Control Conference, pages 679–691. PMLR, 2023
work page 2023
-
[28]
Y . Oussar and G. Dreyfus. How to be a gray box: dynamic semi-physical modeling. Neural networks, 14(9):1161–1172, 2001
work page 2001
-
[29]
T. Qin, K. Wu, and D. Xiu. Data driven governing equations approximation using deep neural networks. Journal of Computational Physics, 395:620–635, 2019
work page 2019
-
[30]
Multistep Neural Networks for Data-driven Discovery of Nonlinear Dynamical Systems
M. Raissi, P. Perdikaris, and G. E. Karniadakis. Multistep neural networks for data-driven discovery of nonlinear dynamical systems. arXiv preprint arXiv:1801.01236, 2018
work page internal anchor Pith review Pith/arXiv arXiv 2018
-
[31]
R. Rico-Martinez and I. G. Kevrekidis. Continuous time modeling of nonlinear systems: A neural network-based approach. In IEEE International Conference on Neural Networks, pages 1522–1525. IEEE, 1993
work page 1993
-
[32]
H. Schaeffer, G. Tran, and R. Ward. Extracting sparse high-dimensional dy- namics from limited data. SIAM Journal on Applied Mathematics, 78(6):3279– 3295, 2018
work page 2018
-
[33]
H. Schaeffer, G. Tran, R. Ward, and L. Zhang. Extracting structured dynamical systems using sparse optimization with very few samples.Multiscale Modeling & Simulation, 18(4):1435–1461, 2020. 45 Numerical Artifacts in Learning Dynamical Systems
work page 2020
-
[34]
S. Terakawa and T. Yaguchi. Modeling error and nonuniqueness of the continuous-time models learned via runge–kutta methods. Mathematics, 12(8):1190, 2024
work page 2024
-
[35]
P. R. Vlachas, W. Byeon, Z. Y . Wan, T. P. Sapsis, and P. Koumoutsakos. Data- driven forecasting of high-dimensional chaotic systems with long short-term memory networks. Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences, 474(2213):20170844, 2018
work page 2018
-
[36]
C. Wehmeyer and F. No ´e. Time-lagged autoencoders: Deep learning of slow collective variables for molecular kinetics. The Journal of chemical physics , 148(24), 2018
work page 2018
-
[37]
X. Xie, G. Zhang, and C. G. Webster. Non-intrusive inference reduced order model for fluids using deep multistep neural network. Mathematics, 7(8):757, 2019
work page 2019
- [38]
-
[39]
L. Zhang and H. Schaeffer. On the convergence of the sindy algorithm. Multi- scale Modeling & Simulation, 17(3):948–972, 2019
work page 2019
-
[40]
A. Zhu, P. Jin, B. Zhu, and Y . Tang. On numerical integration in neural ordinary differential equations. InInternational Conference on Machine Learning, pages 27527–27547. PMLR, 2022
work page 2022
-
[41]
A. Zhu, S. Wu, and Y . Tang. Error analysis based on inverse modified differ- ential equations for discovery of dynamics using linear multistep methods and deep learning. SIAM Journal on Numerical Analysis, 62(5):2087–2120, 2024. 6 Appendix 6.1 Coefficients of Linear Multistep Methods Here we provide a table that summarizes the coefficients of the Linear ...
work page 2087
-
[42]
For instance, while the Leap-Frog scheme is considered to generate predictions zn in (21), the learned dynamic ˆλh follows the following equation: ˆλh = e2λh − 1 2eλh = eλh − e−λh 2 , ⇒ ˆλh = λh + O(h3), ⇒ | ˆλ − λ| = O(h2). Theorem 6.2. The order of convergence ofˆλ to λ as h → 0 corresponds to the order of the linear multistep method used in generating ...
-
[43]
back into (41), we find that |eλh| = 8 9. Consequently, we conclude that if h satisifies |eλh| < 8 9, then all the roots of (40) contain negative real part. Next, we show that there is at least one solution that satisfies the following equation: 1 + z + z2 2! + z3 3! = eλh, |z| ≤ δ, where δ satisfies 6 − δ2 2δ ≥ √ 12.5. Applying Vieta’s formulas: z1 + z2 ...
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.