Model reduction of parametric ordinary differential equations via autoencoders: representation properties and convergence analysis
Pith reviewed 2026-05-18 13:49 UTC · model grok-4.3
The pith
Autoencoders with exact representation properties yield convergent reduced-order models for parametric nonlinear ODEs.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
By training autoencoders that possess exact representation capabilities for the input manifold, the high-dimensional parametric ODE can be replaced by a low-dimensional ODE whose solutions, when reconstructed, converge to the solutions of the original system; the convergence holds under standard assumptions on the network approximation error and the time integrator.
What carries the argument
Autoencoder neural networks engineered for exact representation of the solution manifold, which map the original ODE to a low-dimensional surrogate while preserving the structure needed for convergence proofs.
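The encode, integrate, decode pipeline can be sketched on a toy problem. The following is a minimal illustration under stated assumptions, not the paper's implementation: the three-dimensional system, the hand-written `encoder` and `decoder` (standing in for a trained autoencoder with exact representation), and the helper names `full_rhs`, `reduced_rhs`, and `rk4` are all hypothetical.

```python
import numpy as np

def decoder(z):
    """Hypothetical decoder: lifts a scalar latent state z to R^3."""
    return np.array([z, z**2, np.sin(z)])

def encoder(u):
    """Hypothetical encoder: an exact left inverse of the decoder on its image."""
    return u[0]

def full_rhs(u, mu):
    """Toy full-order right-hand side; trajectories stay on the decoder's image."""
    z = u[0]
    return -mu * z * np.array([1.0, 2.0 * z, np.cos(z)])

def reduced_rhs(z, mu):
    """Reduced right-hand side: encoder Jacobian applied to full_rhs(decoder(z)).

    The encoder picks component 0, so its Jacobian is the first unit row vector.
    """
    return full_rhs(decoder(z), mu)[0]

def rk4(rhs, y0, mu, dt, n_steps):
    """Classical fourth-order Runge-Kutta; works for scalars and arrays."""
    y = y0
    for _ in range(n_steps):
        k1 = rhs(y, mu)
        k2 = rhs(y + 0.5 * dt * k1, mu)
        k3 = rhs(y + 0.5 * dt * k2, mu)
        k4 = rhs(y + dt * k3, mu)
        y = y + dt / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return y

mu, z0, dt, n_steps = 1.5, 0.8, 0.01, 100
u0 = decoder(z0)

# Reduced-order path: encode, integrate the scalar ODE, decode.
u_rom = decoder(rk4(reduced_rhs, encoder(u0), mu, dt, n_steps))
# Full-order path: integrate the three-dimensional ODE directly.
u_fom = rk4(full_rhs, u0, mu, dt, n_steps)
# Closed-form reference for this toy system.
u_exact = decoder(z0 * np.exp(-mu * dt * n_steps))

rom_error = np.max(np.abs(u_rom - u_exact))
fom_error = np.max(np.abs(u_fom - u_exact))
print(f"ROM error: {rom_error:.2e}, FOM error: {fom_error:.2e}")
```

In this toy setting the reduced ODE is scalar while the full system has three components; both paths track the same reference because the encoder and decoder invert each other exactly on the solution manifold.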
If this is right
- Standard time integrators applied to the low-dimensional ODE produce trajectories whose lift to the original space approximates the high-fidelity solution.
- The stability properties of the reconstructed solution are either preserved by the reduction step or modified by it in a quantifiable way.
- The method remains accurate for strongly nonlinear and parametric regimes where linear projection techniques typically degrade.
- The same autoencoder-based reduction framework can be reused across multiple parameter values once the network is trained.
Where Pith is reading between the lines
- The same representation-property approach could be tested on systems whose solution manifolds are known to be low-dimensional, such as certain chemical kinetics models.
- If the exact-representation condition can be relaxed to controlled approximation error, the framework would apply to noisy or incomplete data sets.
- Extending the convergence analysis to include discretization error in both space and time would connect the method to existing finite-element reduced-basis theory.
Load-bearing premise
The autoencoder can be chosen and trained so that it reconstructs the input manifold with exact representation capabilities.
What would settle it
A family of test problems in which the reconstructed high-dimensional solution diverges from the full-order reference even when the autoencoder reconstruction error is driven to zero would disprove the convergence claim.
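The shape of such a test can be sketched on a toy problem: inject a controlled representation error of size `eps` into the decoder and watch the reconstructed error as `eps` shrinks. Everything here is an illustrative assumption (the toy manifold, the perturbation, and the names `make_decoder` and `reconstruction_error`); it is not one of the paper's experiments.

```python
import numpy as np

MU, Z0, T, N_STEPS = 1.5, 0.8, 1.0, 200

def exact_decoder(z):
    """Hypothetical exact decoder for the toy manifold {(z, z^2, sin z)}."""
    return np.array([z, z**2, np.sin(z)])

def make_decoder(eps):
    """Decoder with an artificial representation error of size O(eps)."""
    def decoder(z):
        return exact_decoder(z) + eps * np.array([np.sin(3 * z), np.cos(3 * z), z])
    return decoder

def full_rhs(u):
    """Toy full-order dynamics; exact solution is exact_decoder(Z0 * exp(-MU * t))."""
    return -MU * u[0] * np.array([1.0, 2.0 * u[0], np.cos(u[0])])

def reconstruction_error(eps):
    """Max-norm error at time T of the decoded reduced solution."""
    decoder = make_decoder(eps)
    # The encoder is taken to be "first component", so the reduced
    # right-hand side is the first component of full_rhs(decoder(z)).
    rhs = lambda z: full_rhs(decoder(z))[0]
    z, dt = Z0, T / N_STEPS
    for _ in range(N_STEPS):  # classical RK4 on the scalar reduced ODE
        k1 = rhs(z)
        k2 = rhs(z + 0.5 * dt * k1)
        k3 = rhs(z + 0.5 * dt * k2)
        k4 = rhs(z + dt * k3)
        z += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    u_ref = exact_decoder(Z0 * np.exp(-MU * T))
    return float(np.max(np.abs(decoder(z) - u_ref)))

eps_values = [1e-1, 1e-2, 1e-3, 1e-4, 0.0]
errors = [reconstruction_error(e) for e in eps_values]
for e, err in zip(eps_values, errors):
    print(f"eps = {e:.0e}  ->  reconstructed error = {err:.2e}")
```

If the convergence claim holds, the error should shrink together with `eps` and settle at the time-integration error once `eps` is zero. A counterexample would show the error staying bounded away from zero, or growing, along the same sweep.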
Original abstract
We propose a reduced-order modeling approach for nonlinear, parameter-dependent ordinary differential equations (ODE). Dimensionality reduction is achieved using nonlinear maps represented by autoencoders. The resulting low-dimensional ODE is then solved using standard integration in time schemes, and the high-dimensional solution is reconstructed from the low-dimensional one. We investigate the architecture of neural networks for constructing effective autoencoders that hold necessary properties to reconstruct the input manifold with exact representation capabilities. We study the convergence of the reduced-order model to the high-fidelity one. Numerical experiments show the robustness and accuracy of our approach in different scenarios, highlighting its effectiveness in highly complex and nonlinear settings without sacrificing accuracy. Moreover, we examine how the reduction influences the stability properties of the reconstructed high-dimensional solution.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes a reduced-order modeling method for nonlinear parametric ODEs that uses autoencoders to perform nonlinear dimensionality reduction. The resulting low-dimensional ODE is integrated with standard time-stepping schemes and the high-dimensional solution is reconstructed via the decoder. The authors investigate autoencoder architectures that achieve exact representation of the solution manifold, derive a convergence analysis of the reduced model to the full-order parametric ODE, and present numerical experiments demonstrating accuracy and robustness in complex nonlinear regimes. They additionally examine how the reduction affects stability properties of the reconstructed trajectories.
Significance. If the convergence result can be extended to account for finite training and approximation errors, the work would supply a theoretically grounded nonlinear alternative to classical projection-based reduction for parametric ODEs. The combination of representation-property analysis, convergence statements, and stability examination is a constructive contribution; the numerical examples already illustrate practical utility in highly nonlinear settings.
major comments (2)
- [Convergence analysis] Convergence analysis (following the architecture discussion): the argument requires the autoencoder to possess exact representation properties so that the reduced ODE converges to the high-fidelity parametric system. No explicit error-propagation estimate is supplied that bounds the effect of residual training error, finite network width/depth, or parametric sampling on the Lipschitz constants or stability margins of the reconstructed high-dimensional trajectory. This assumption is load-bearing for the central convergence claim.
- [Architecture investigation] Architecture section: while the paper studies network designs that enable exact manifold reconstruction, it does not detail how the parametric dependence of the ODE is incorporated into the training loss or sampling strategy to guarantee that the representation properties hold uniformly over the parameter domain. Without such uniformity the subsequent convergence and stability statements may not transfer directly to the parametric setting.
minor comments (2)
- Define the precise norms and function spaces used for the error bounds and stability analysis to make the theoretical statements unambiguous.
- Clarify whether the reported numerical experiments employ the same autoencoder weights for all parameter values or retrain per parameter; this affects interpretation of the robustness claims.
Simulated Author's Rebuttal
We thank the referee for the positive overall assessment and the detailed, constructive comments. We address each major comment below, indicating the revisions we will make.
- Referee: [Convergence analysis] Convergence analysis (following the architecture discussion): the argument requires the autoencoder to possess exact representation properties so that the reduced ODE converges to the high-fidelity parametric system. No explicit error-propagation estimate is supplied that bounds the effect of residual training error, finite network width/depth, or parametric sampling on the Lipschitz constants or stability margins of the reconstructed high-dimensional trajectory. This assumption is load-bearing for the central convergence claim.
  Authors: We appreciate the referee highlighting this point. The convergence theorem is derived under the exact-representation assumption established in the preceding architecture analysis; this permits a clean statement that the reduced ODE converges to the full-order parametric system in the limit of increasing latent dimension. We agree that the absence of an explicit error-propagation bound for residual training error, finite network capacity, or sampling density constitutes a limitation of the current analysis. Deriving such bounds would require additional technical machinery from approximation theory and would lengthen the paper considerably. In the revised manuscript we will add a short subsection that (i) explicitly states the ideal-case nature of the result, (ii) discusses the practical implications of approximate autoencoders, and (iii) sketches a possible route toward quantitative error estimates under standard Lipschitz and boundedness assumptions on the network. This constitutes a partial revision.
  Revision: partial
- Referee: [Architecture investigation] Architecture section: while the paper studies network designs that enable exact manifold reconstruction, it does not detail how the parametric dependence of the ODE is incorporated into the training loss or sampling strategy to guarantee that the representation properties hold uniformly over the parameter domain. Without such uniformity the subsequent convergence and stability statements may not transfer directly to the parametric setting.
  Authors: We thank the referee for this observation. In the current manuscript the training data consist of solution snapshots generated for a dense, uniform grid of parameter values drawn from the admissible parameter domain; the autoencoder is trained with the standard mean-squared reconstruction loss evaluated on these parametric snapshots. This procedure is intended to promote uniform representation properties across the parameter range. We acknowledge that the manuscript does not spell out the sampling density or the precise manner in which uniformity is enforced. In the revised version we will expand the architecture section with an explicit description of the parameter-sampling strategy, the construction of the training set, and a brief argument why the resulting representation properties transfer to the full parameter domain, thereby supporting the subsequent convergence and stability claims.
  Revision: yes
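The training-set construction described in the second response (snapshots on a uniform parameter grid, mean-squared reconstruction loss) can be sketched as follows, together with a held-out check of whether the representation carries over to unseen parameter values. This is a hedged illustration: the closed-form toy trajectories, the hand-written `encoder` and `decoder`, the grid sizes, and the rank-1 linear baseline are assumptions made for the example and do not come from the manuscript.

```python
import numpy as np

def decoder(z):
    """Hypothetical decoder, vectorised over an array of latent values."""
    z = np.asarray(z)
    return np.stack([z, z**2, np.sin(z)], axis=-1)

def encoder(u):
    """Hypothetical encoder: first component of each snapshot."""
    return u[..., 0]

def snapshots(mus, times, z0=0.8):
    """Toy trajectories u(t; mu) = decoder(z0 * exp(-mu * t)), one row per snapshot."""
    z = z0 * np.exp(-np.outer(mus, times))
    return decoder(z).reshape(-1, 3)

def mse(u, u_rec):
    """Mean-squared reconstruction loss over all snapshots and components."""
    return float(np.mean((u - u_rec) ** 2))

times = np.linspace(0.0, 1.0, 21)
train_mus = np.linspace(0.5, 2.0, 16)          # uniform training grid
rng = np.random.default_rng(0)
test_mus = rng.uniform(0.5, 2.0, size=50)      # held-out parameter values

U_train = snapshots(train_mus, times)
U_test = snapshots(test_mus, times)

# Nonlinear autoencoder: loss on the training grid and on held-out parameters.
train_loss = mse(U_train, decoder(encoder(U_train)))
test_loss = mse(U_test, decoder(encoder(U_test)))

# Rank-1 linear baseline fitted on the training snapshots (SVD, no centring).
_, _, Vt = np.linalg.svd(U_train, full_matrices=False)
v = Vt[0]
pod_test_loss = mse(U_test, np.outer(U_test @ v, v))

print(f"autoencoder train/test MSE: {train_loss:.1e} / {test_loss:.1e}")
print(f"rank-1 linear test MSE:     {pod_test_loss:.1e}")
```

Comparing the loss on the training grid with the loss on held-out parameters is one simple way to probe the uniformity the referee asks about; with a trained network the two numbers would generally differ, and the gap indicates how well the sampling covers the parameter domain.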
Circularity Check
No significant circularity; convergence analysis rests on explicit assumption of exact autoencoder reconstruction.
The paper states its convergence results under the assumption that the autoencoder is trained to possess exact representation properties for the solution manifold, then analyzes the reduced ODE under that hypothesis using standard numerical integration. This is a forward assumption rather than a self-referential definition or fitted quantity renamed as prediction. No load-bearing step reduces by construction to the same data or prior self-citation chain; the architecture investigation and stability examination remain independent of the target convergence claim. The numerical experiments provide external validation in nonlinear regimes, confirming the derivation chain is self-contained.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption: Autoencoders can be constructed to possess exact representation capabilities for the solution manifold of the parametric ODE.
- standard math: Standard time-integration schemes applied to the reduced ODE produce solutions that converge to the high-fidelity solution under suitable conditions.
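One way to see how the two axioms combine is a triangle-inequality split of the reconstruction error. The sketch below is a generic argument, not the paper's proof, and its notation is assumed for illustration: $E$ is the encoder, $D$ the decoder with Lipschitz constant $L_D$, $u$ the full-order solution, $z$ the exact solution of the reduced ODE with $z(0) = E(u(0))$, and $z_h^k$ the output at time $t_k$ of an integrator of order $p$ with step $\Delta t$.

```latex
% Assumed notation: E encoder, D decoder (Lipschitz constant L_D),
% u full-order solution, z exact reduced solution, z_h^k numerical reduced solution.
\begin{aligned}
\| u(t_k) - D(z_h^k) \|
  &\le \| u(t_k) - D(z(t_k)) \| + \| D(z(t_k)) - D(z_h^k) \| \\
  &\le \underbrace{\| u(t_k) - D(z(t_k)) \|}_{=\,0 \text{ under exact representation}}
     + L_D \, \| z(t_k) - z_h^k \| \\
  &\le L_D \, C \, \Delta t^{p}.
\end{aligned}
```

The first term vanishes when $D \circ E$ is the identity on the solution manifold, because $z(t) = E(u(t))$ then solves the reduced ODE. The constant $C$ depends on the final time and on the Lipschitz constant and smoothness of the reduced right-hand side. If the autoencoder is only approximate, the first term no longer vanishes and has to be bounded separately.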
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
  unclear: relation between the paper passage and the cited Recognition theorem.
  Paper passage: We investigate the architecture of neural networks for constructing effective autoencoders that hold necessary properties to reconstruct the input manifold with exact representation capabilities... Theorem 6. Global convergence... $\max_{\mu} \| u_N - U_N \|_{\infty} \le L_{\Psi} \left( \varepsilon_{\Psi'}(W) + \varepsilon_{F_n}(\Delta t_{\mathrm{train}}, W)\, T \right) e^{L_{F_n} T} + \dots$
- IndisputableMonolith/Foundation/AlexanderDuality.lean · alexander_duality_circle_linking · unclear
  unclear: relation between the paper passage and the cited Recognition theorem.
  Paper passage: Theorem 3 ($a$-manifold)... the set $\{ u_N(t;\mu) \}$ is an $a$-manifold... minimum latent dimension $n \le 2a+1$ (Menger-Nöbeling)
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.