Geometric regularization of autoencoders via observed stochastic dynamics

Felix X.-F. Ye; Sean Hill

arXiv: 2604.16282 · v1 · submitted 2026-04-17 · 💻 cs.LG · math.DS · math.PR

Pith reviewed 2026-05-10 09:07 UTC · model grok-4.3

classification 💻 cs.LG · math.DS · math.PR

keywords: autoencoders, stochastic dynamical systems, manifold learning, geometric regularization, mean first-passage times, tangent bundle, latent SDE, chart convergence

The pith

Ambient covariance penalties let autoencoders learn charts whose errors propagate controllably to accurate stochastic dynamics and MFPTs.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper demonstrates that the covariance observed in ambient space encodes tangent bundle geometry, which can be turned into penalties that regularize an autoencoder pipeline for learning both a nonlinear chart and the latent SDE. This addresses the problem of building reduced simulators for metastable systems where local-chart methods scale poorly and plain autoencoders leave geometry unconstrained. A bias decomposition shows systematic error in standard drift formulas for imperfect charts, motivating an encoder-pullback target from Itô's formula. Under W^{2,∞} convergence the chart errors control weak convergence of ambient dynamics and radial mean first-passage times, with experiments showing 50-70% MFPT error reduction and up to 10x lower coefficient errors.

Core claim

Observed ambient covariance Λ spans the tangent bundle in a coordinate-invariant manner. Penalties derived from it induce the ρ-metric on charts and, combined with an Itô-derived encoder target for drift, produce a three-stage learner for which W^{2,∞} chart convergence implies controllable propagation to weak ambient dynamics convergence and radial MFPT convergence, achieving the lowest inter-well MFPT errors on most tested pairs and order-of-magnitude coefficient improvements.

What carries the argument

The ρ-metric on the space of charts, induced by tangent-bundle and inverse-consistency penalties from ambient covariance Λ, which is weaker than H¹ yet matches its generalization rate up to logs.

Load-bearing premise

The ambient covariance encodes coordinate-invariant tangent-space information whose range spans the tangent bundle, so penalties remain effective for imperfect charts.
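To make the premise concrete, here is a minimal NumPy sketch of how tangent information might be read off from short-burst increments. It is an illustration under stated assumptions, not the paper's implementation: the function names `tangent_projector_from_bursts` and `tangent_penalty` are hypothetical, the intrinsic dimension `d` is assumed known, and the penalty form (squared normal component of the decoder Jacobian) is one plausible choice rather than the authors' definition.

```python
import numpy as np

def tangent_projector_from_bursts(increments, dt, d):
    """Estimate the ambient covariance from burst increments and return
    (covariance, projector onto its top-d eigenspace).
    Assumption: drift contributes O(dt^2) to the second moment and is ignored."""
    lam = increments.T @ increments / (len(increments) * dt)
    eigvals, eigvecs = np.linalg.eigh(lam)   # eigenvalues in ascending order
    basis = eigvecs[:, -d:]                  # top-d eigenvectors
    return lam, basis @ basis.T

def tangent_penalty(decoder_jacobian, projector):
    """Hypothetical penalty: squared Frobenius norm of the part of the
    decoder Jacobian that lies outside the estimated tangent space."""
    normal_part = decoder_jacobian - projector @ decoder_jacobian
    return float(np.sum(normal_part ** 2))

# Toy setting: unit circle in the z = 0 plane of R^3, base point (1, 0, 0),
# so the true tangent direction is (0, 1, 0).
rng = np.random.default_rng(0)
dt = 1e-3
tangent = np.array([0.0, 1.0, 0.0])
noise = rng.normal(size=(5000, 1)) * np.sqrt(dt)
increments = noise * tangent                 # purely tangential increments

lam, proj = tangent_projector_from_bursts(increments, dt, d=1)
good_jac = np.array([[0.0], [2.0], [0.0]])   # decoder column along the tangent
bad_jac = np.array([[1.0], [1.0], [0.0]])    # decoder column with a normal part
print(tangent_penalty(good_jac, proj), tangent_penalty(bad_jac, proj))
```

The projector depends only on the range of the estimated covariance, not on any latent coordinates, which is the sense in which such a penalty could stay meaningful for an imperfect chart.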

What would settle it

Finding a case where the W^{2,∞} chart-convergence assumption holds yet the weak convergence of ambient dynamics or radial MFPT convergence fails would falsify the propagation claim.
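A test of this kind needs a way to measure MFPTs numerically. The sketch below is one hedged possibility, assuming that "radial MFPT" means the mean time for a path to first leave a ball of a given radius around its starting point; the summary does not define the term. The function `radial_mfpt`, its parameters, and the isotropic-diffusion simplification are all illustrative assumptions, and the integrator is plain Euler–Maruyama.

```python
import numpy as np

def radial_mfpt(drift, diffusion, x0, radius, dt=1e-3,
                n_paths=2000, max_steps=100_000, seed=0):
    """Monte Carlo estimate of the mean time to first leave the ball of
    the given radius around x0, using Euler-Maruyama steps.
    Assumption: diffusion(x) is a scalar or broadcastable (isotropic noise).
    Paths that never exit within max_steps are ignored in the average."""
    rng = np.random.default_rng(seed)
    x0 = np.asarray(x0, dtype=float)
    x = np.tile(x0, (n_paths, 1))
    exit_time = np.full(n_paths, np.nan)
    alive = np.ones(n_paths, dtype=bool)
    for step in range(1, max_steps + 1):
        if not alive.any():
            break
        xa = x[alive]
        dw = rng.normal(size=xa.shape) * np.sqrt(dt)
        xa = xa + drift(xa) * dt + diffusion(xa) * dw
        x[alive] = xa
        exited = np.linalg.norm(xa - x0, axis=1) >= radius
        idx = np.flatnonzero(alive)[exited]
        exit_time[idx] = step * dt
        alive[idx] = False
    return float(np.nanmean(exit_time))

# Sanity check on standard Brownian motion in R^2, where the exact mean
# exit time from a ball of radius r started at the center is r^2 / 2.
est = radial_mfpt(lambda x: np.zeros_like(x), lambda x: 1.0,
                  x0=np.zeros(2), radius=0.5)
print(est)
```

Because exits are only checked at discrete steps, the estimate is biased slightly upward relative to the exact value of 0.125; comparing such estimates between a learned simulator and a reference simulator is the kind of check the falsification test would require.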

Figures

Figures reproduced from arXiv: 2604.16282 by Felix X.-F. Ye, Sean Hill.

Original abstract

Stochastic dynamical systems with slow or metastable behavior evolve, on long time scales, on an unknown low-dimensional manifold in high-dimensional ambient space. Building a reduced simulator from short-burst ambient ensembles is a long-standing problem: local-chart methods like ATLAS suffer from exponential landmark scaling and per-step reprojection, while autoencoder alternatives leave tangent-bundle geometry poorly constrained, and the errors propagate into the learned drift and diffusion. We observe that the ambient covariance Λ already encodes coordinate-invariant tangent-space information, its range spanning the tangent bundle. Using this, we construct a tangent-bundle penalty and an inverse-consistency penalty for a three-stage pipeline (chart learning, latent drift, latent diffusion) that learns a single nonlinear chart and the latent SDE. The penalties induce a function-space metric, the ρ-metric, strictly weaker than the Sobolev H¹ norm yet achieving the same chart-quality generalization rate up to logarithmic factors. For the drift, we derive an encoder-pullback target via Itô's formula on the learned encoder and prove a bias decomposition showing the standard decoder-side formula carries systematic error for any imperfect chart. Under a W^{2,∞} chart-convergence assumption, chart-level error propagates controllably to weak convergence of the ambient dynamics and to convergence of radial mean first-passage times. Experiments on four surfaces embedded in up to 201 ambient dimensions reduce radial MFPT error by 50–70% under rotation dynamics and achieve the lowest inter-well MFPT error on most surface–transition pairs under metastable Müller–Brown Langevin dynamics, while reducing end-to-end ambient coefficient errors by up to an order of magnitude relative to an unregularized autoencoder.
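For readers who want the formula behind an "encoder-pullback target", the standard Itô computation can be sketched as follows. The notation is assumed here rather than taken from the paper: an ambient SDE with drift b and diffusion σ, covariance Λ = σσᵀ, and a twice-differentiable encoder E from R^D to R^d with components E^k. How the paper estimates each term from data is not stated in the abstract.

```latex
% Assumed notation: b (ambient drift), \sigma (ambient diffusion),
% \Lambda = \sigma\sigma^\top, E : \mathbb{R}^D \to \mathbb{R}^d with components E^k.
\begin{align*}
dX_t &= b(X_t)\,dt + \sigma(X_t)\,dW_t, \qquad \Lambda = \sigma\sigma^{\top}, \\
dE^k(X_t) &= \Big[\nabla E^k(X_t)\cdot b(X_t)
  + \tfrac{1}{2}\operatorname{tr}\!\big(\Lambda(X_t)\,\nabla^2 E^k(X_t)\big)\Big]\,dt
  + \nabla E^k(X_t)^{\top}\sigma(X_t)\,dW_t, \\
\text{latent covariance} &= DE\,\Lambda\,DE^{\top}.
\end{align*}
```

The bracketed term is the latent drift in encoder coordinates. It contains the Hessian of the encoder, which is one plausible reading of why the propagation result is stated under a second-derivative (W^{2,∞}) assumption on the chart.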

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper adds covariance-based tangent penalties and a bias-corrected Itô pullback to autoencoder pipelines for latent SDEs, with reported MFPT gains, but the main propagation theorem rests on an unverified W^{2,∞} assumption.

The core contribution is a three-stage setup that pulls tangent information from ambient covariance to build a ρ-metric penalty for charts plus an inverse-consistency term, then uses an encoder-side Itô pullback with explicit bias decomposition for the latent drift. This is new relative to ATLAS-style local charts and plain autoencoders. The experiments on four embedded surfaces up to 201 dimensions show 50–70% drops in radial MFPT error under rotation and order-of-magnitude cuts in ambient coefficient error versus the unregularized baseline, plus lowest inter-well MFPT on most Müller–Brown pairs. Those numbers are concrete and worth noting for anyone building reduced stochastic simulators from ambient bursts.

Referee Report

2 major / 2 minor

Summary. The paper proposes a three-stage pipeline for learning a nonlinear chart and latent SDE from high-dimensional ambient stochastic dynamics. It constructs tangent-bundle and inverse-consistency penalties from the observed ambient covariance Λ, introduces a ρ-metric that is weaker than H¹ yet achieves comparable generalization rates up to log factors, derives an Itô pullback target for the latent drift together with a bias decomposition for imperfect charts, and proves that chart error propagates controllably to weak convergence of the ambient dynamics and to radial MFPT convergence under a W^{2,∞} chart-convergence assumption. Experiments on four embedded surfaces (up to 201 ambient dimensions) report 50–70 % reductions in radial MFPT error under rotation dynamics, lowest inter-well MFPT error on most Müller–Brown pairs, and up to an order-of-magnitude improvement in end-to-end ambient coefficient accuracy relative to an unregularized autoencoder.

Significance. If the W^{2,∞} assumption holds in practice and the bias decomposition is tight, the work supplies a coordinate-invariant geometric regularizer that directly constrains the tangent bundle and mitigates error propagation into learned drift and diffusion—addressing a recognized limitation of standard autoencoders for SDE manifold learning. The explicit Itô-derived bias decomposition and the controlled propagation result to MFPTs are technically substantive contributions. The reported quantitative gains (50–70 % MFPT error reduction, order-of-magnitude coefficient improvement) suggest practical value for reduced-order modeling of metastable systems, provided the theoretical mechanism can be linked to the observed performance.

major comments (2)

[Abstract / Theoretical Results] Abstract and theoretical development: the claim that chart-level error propagates controllably to weak convergence of the ambient dynamics and to convergence of radial mean first-passage times is established only under the W^{2,∞} chart-convergence assumption. The ρ-metric regularization is stated to be strictly weaker than H¹ and to control first-order terms only up to logarithmic factors, supplying no uniform bound on second derivatives. Consequently the experimental improvements cannot yet be attributed to the proven propagation mechanism rather than incidental regularization effects.
[Experiments] Experiments section: no diagnostics are reported that confirm the learned charts satisfy the W^{2,∞} assumption required for the propagation guarantees (e.g., sup-norm of Hessian error, second-derivative convergence plots, or comparison against the assumed rate). Without such verification the 50–70 % radial MFPT error reduction and order-of-magnitude ambient-coefficient improvement cannot be confidently linked to the theoretical result.

minor comments (2)

[Method] The definition and properties of the ρ-metric should be stated formally in the main text (with explicit comparison to H¹) rather than only summarized in the abstract.
[Introduction / Preliminaries] Notation for the ambient covariance Λ and its range spanning the tangent bundle could be introduced with a short lemma or remark to make the coordinate-invariance claim self-contained.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful and constructive review. We address the two major comments point by point below, indicating the revisions we will incorporate.

Referee: [Abstract / Theoretical Results] Abstract and theoretical development: the claim that chart-level error propagates controllably to weak convergence of the ambient dynamics and to convergence of radial mean first-passage times is established only under the W^{2,∞} chart-convergence assumption. The ρ-metric regularization is stated to be strictly weaker than H¹ and to control first-order terms only up to logarithmic factors, supplying no uniform bound on second derivatives. Consequently the experimental improvements cannot yet be attributed to the proven propagation mechanism rather than incidental regularization effects.

Authors: We agree that the propagation guarantees for weak convergence of the ambient dynamics and for radial MFPT convergence are proven only under the W^{2,∞} chart-convergence assumption, and that the ρ-metric is strictly weaker than H¹ with control on first-order terms up to logarithmic factors but without a uniform bound on second derivatives. We will revise the abstract and the theoretical sections to state these assumptions more explicitly and to clarify that the experimental gains are not claimed to be a direct verification of the propagation theorem. At the same time, the tangent-bundle penalty derived from observed covariance Λ is coordinate-invariant and directly constrains the geometry that enters the Itô pullback and bias decomposition; the reported 50–70 % MFPT reductions and order-of-magnitude coefficient improvements therefore remain evidence of the practical utility of the regularizer even if the full W^{2,∞} rate is not yet verified.
Revision: partial
Referee: [Experiments] Experiments section: no diagnostics are reported that confirm the learned charts satisfy the W^{2,∞} assumption required for the propagation guarantees (e.g., sup-norm of Hessian error, second-derivative convergence plots, or comparison against the assumed rate). Without such verification the 50–70 % radial MFPT error reduction and order-of-magnitude ambient-coefficient improvement cannot be confidently linked to the theoretical result.

Authors: We accept the observation that the current experiments section lacks explicit diagnostics for the W^{2,∞} assumption. In the revised manuscript we will add, for the four synthetic embedded surfaces, (i) estimates of the sup-norm of the Hessian error between the learned chart and the ground-truth embedding and (ii) second-derivative convergence plots with respect to regularization strength. Because the manifolds are known, these quantities are computable from the encoder and decoder Jacobians and Hessians. The added diagnostics will allow readers to assess how closely the learned charts approach the assumption and will strengthen the link between the observed performance gains and the geometric regularization.
Revision: yes
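A diagnostic of type (i) could look roughly like the sketch below. It is a hedged illustration only: `hessian_fd` and `sup_hessian_error` are hypothetical helpers, Hessians are approximated by central finite differences rather than automatic differentiation, the sup-norm is taken as the largest absolute entry over a finite sample, and the "learned" chart is a toy stand-in rather than a trained decoder.

```python
import numpy as np

def hessian_fd(f, z, h=1e-3):
    """Central-difference Hessian of f: R^d -> R^D at z, shape (D, d, d)."""
    z = np.asarray(z, dtype=float)
    d = z.size
    D = np.asarray(f(z)).size
    hess = np.zeros((D, d, d))
    for i in range(d):
        for j in range(d):
            ei = np.zeros(d)
            ei[i] = h
            ej = np.zeros(d)
            ej[j] = h
            hess[:, i, j] = (f(z + ei + ej) - f(z + ei - ej)
                             - f(z - ei + ej) + f(z - ei - ej)) / (4 * h * h)
    return hess

def sup_hessian_error(f_learned, f_true, points, h=1e-3):
    """Largest absolute entry of the Hessian difference over sample points,
    a finite-sample proxy for the sup-norm of the Hessian error."""
    return max(
        np.max(np.abs(hessian_fd(f_learned, z, h) - hessian_fd(f_true, z, h)))
        for z in points
    )

# Toy ground truth: a paraboloid embedded in R^3; the stand-in "learned"
# chart gets the curvature in the first latent direction 10% wrong.
def true_chart(z):
    return np.array([z[0], z[1], z[0] ** 2 + z[1] ** 2])

def learned_chart(z):
    return np.array([z[0], z[1], 1.1 * z[0] ** 2 + z[1] ** 2])

rng = np.random.default_rng(0)
points = rng.uniform(-1.0, 1.0, size=(20, 2))
err = sup_hessian_error(learned_chart, true_chart, points)
print(err)  # close to 0.2, the gap between second derivatives 2.2 and 2.0
```

Tracking this quantity against regularization strength or sample size would give the second-derivative convergence plots mentioned in (ii).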

Circularity Check

0 steps flagged

No circularity: derivation uses external covariance property and standard Itô application under explicit assumption

The paper's core steps derive the tangent-bundle penalty directly from the ambient covariance Λ (an observed property of the stochastic dynamics, independent of the learned chart) and apply Itô's formula to obtain the encoder-pullback target for the drift, followed by an explicit bias decomposition. The propagation guarantee to weak convergence and MFPT convergence is stated conditionally on the W^{2,∞} chart-convergence assumption rather than derived from the regularization itself. No step renames a fitted quantity as a prediction, no self-citation is load-bearing for the central claims, and the ρ-metric is constructed from first principles as weaker than H¹ yet rate-equivalent up to logs. Experiments report empirical error reductions without claiming they close the theoretical loop or verify the assumption. The chain therefore remains self-contained against external dynamical properties.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review; no explicit free parameters, axioms, or invented entities can be identified. The method likely introduces regularization hyperparameters and relies on standard assumptions from stochastic analysis and manifold learning, but details are unavailable.


[1] [1]

Intrinsic stochastic differential equations as jets.Pro- ceedings of the Royal Society A, 474(2210):20170559, 2018

John Armstrong and Damiano Brigo. Intrinsic stochastic differential equations as jets.Pro- ceedings of the Royal Society A, 474(2210):20170559, 2018

work page 2018

[2] [2]

Projections of SDEs onto submanifolds

John Armstrong, Damiano Brigo, and Emilio Ferrucci. Projections of SDEs onto submanifolds. Information Geometry, 7(Suppl 1):397–427, 2024

work page 2024

[3] [3]

Latent space oddity: on the curvature of deep generative models

Georgios Arvanitidis, Lars Kai Hansen, and Søren Hauberg. Latent space oddity: on the curvature of deep generative models. InInternational Conference on Learning Representations (ICLR), 2018

work page 2018

[4] [4]

Bartlett, Olivier Bousquet, and Shahar Mendelson

Peter L. Bartlett, Olivier Bousquet, and Shahar Mendelson. Local Rademacher complexities. The Annals of Statistics, 33(4):1497–1537, 2005

work page 2005

[5] [5]

Probability and its Applications

Nils Berglund and Barbara Gentz.Noise-Induced Phenomena in Slow-Fast Dynamical Systems: A Sample-Paths Approach. Probability and its Applications. Springer, 2006

work page 2006

[6] [6]

Kevrekidis

Tom Bertalan, Felix Dietrich, Igor Mezi´ c, and Ioannis G. Kevrekidis. On learning Hamiltonian systems from data.Chaos, 29(12):121107, 2019

work page 2019

[7] [7]

Transition manifolds of complex metastable systems: Theory and data-driven com- putation of effective dynamics.J

Andreas Bittracher, P´ eter Koltai, Stefan Klus, Ralf Banisch, Michael Dellnitz, and Christof Sch¨ utte. Transition manifolds of complex metastable systems: Theory and data-driven com- putation of effective dynamics.J. Nonlinear Sci., 28(2):471–512, 2018. 26 Table 6: Full ablation under MB Langevin (N=200, 10 seeds, medians).Bold= best per column. D=11D=2...

work page 2018

[8] [8]

Data-driven discovery of coordinates and governing equations.Proceedings of the National Academy of Sciences, 116(45):22445–22451, 2019

Kathleen Champion, Bethany Lusch, J Nathan Kutz, and Steven L Brunton. Data-driven discovery of coordinates and governing equations.Proceedings of the National Academy of Sciences, 116(45):22445–22451, 2019

work page 2019

[9] [9]

Chirikjian.Stochastic Models, Information Theory, and Lie Groups, Volume 1: Classi- cal Results and Geometric Methods

G.S. Chirikjian.Stochastic Models, Information Theory, and Lie Groups, Volume 1: Classi- cal Results and Geometric Methods. Applied and Numerical Harmonic Analysis. Birkh¨ auser Boston, 2009

work page 2009

[10] [10]

Coifman, Ioannis G

Ronald R. Coifman, Ioannis G. Kevrekidis, St´ ephane Lafon, Mauro Maggioni, and Boaz Nadler. Diffusion maps, reduction coordinates, and low dimensional representation of stochas- tic systems.Multiscale Modeling & Simulation, 7(2):842–864, 2008

work page 2008

[11] [11]

Coifman and St´ ephane Lafon

Ronald R. Coifman and St´ ephane Lafon. Diffusion maps.Applied and Computational Har- monic Analysis, 21(1):5–30, 2006

work page 2006

[12] [12]

ATLAS: a geometric approach to learning high- dimensional stochastic systems near manifolds.Multiscale Model

Miles Crosskey and Mauro Maggioni. ATLAS: a geometric approach to learning high- dimensional stochastic systems near manifolds.Multiscale Model. Simul., 15(1):110–156, 2017

work page 2017

[13] [13]

Riemannian score-based generative modelling

Valentin De Bortoli, Emile Mathieu, Michael Hutchinson, James Thornton, Yee Whye Teh, and Arnaud Doucet. Riemannian score-based generative modelling. InAdvances in Neural Information Processing Systems, volume 35, pages 2406–2422, 2022

work page 2022

[14] [14]

Kevrekidis

Felix Dietrich, Alexei Makeev, George Kevrekidis, Nikolaos Evangelou, Tom Bertalan, Sebas- tian Reich, and Ioannis G. Kevrekidis. Learning effective stochastic differential equations from 27 microscopic simulations: Linking stochastic numerics to deep learning.Chaos, 33(2):023121, 2023

work page 2023

[15] [15]

Kevrekidis

Nikolaos Evangelou, Felix Dietrich, Eliodoro Chiavazzo, Daniel Lehmberg, Marina Meila, and Ioannis G. Kevrekidis. Double diffusion maps and their latent harmonics for scientific compu- tations in latent space.Journal of Computational Physics, 485:112072, 2023

work page 2023

[16] [16]

Data-driven discovery of intrinsic dynamics.Nature Machine Intelligence, 4(12):1113–1120, 2022

Daniel Floryan and Michael D Graham. Data-driven discovery of intrinsic dynamics.Nature Machine Intelligence, 4(12):1113–1120, 2022

work page 2022

[17] [17]

ICON: Learn- ing regular maps through inverse consistency

Hastings Greer, Roland Kwitt, Fran¸ cois-Xavier Vialard, and Marc Niethammer. ICON: Learn- ing regular maps through inverse consistency. InProc. IEEE/CVF Intl. Conf. Computer Vision (ICCV), pages 3396–3405, 2021

work page 2021

[18] [18]

Springer Series in Statistics

L´ aszl´ o Gy¨ orfi, Michael Kohler, Adam Krzy˙ zak, and Harro Walk.A Distribution-Free Theory of Nonparametric Regression. Springer Series in Statistics. Springer, 2002

work page 2002

[19] [19]

Reaction-rate theory: fifty years after Kramers.Rev

Peter H¨ anggi, Peter Talkner, and Michal Borkovec. Reaction-rate theory: fifty years after Kramers.Rev. Mod. Phys., 62(2):251–341, 1990

[20] Ali Hasan, João M. Pereira, Sina Farsiu, and Vahid Tarokh. Identifying latent stochastic differential equations. IEEE Transactions on Signal Processing, 70:89–104, 2022.

[21] Roger A. Horn and Charles R. Johnson. Matrix Analysis. Cambridge University Press, 2nd edition, 2013.

[22] Elton P. Hsu. Stochastic Analysis on Manifolds, volume 38 of Graduate Studies in Mathematics. American Mathematical Society, Providence, RI, 2002.

[23] Chin-Wei Huang, Milad Aghajohari, Avishek Joey Bose, Prakash Panangaden, and Aaron Courville. Riemannian diffusion models. In Advances in Neural Information Processing Systems, volume 35, 2022.

[24] George A. Kevrekidis, Mauro Maggioni, Soledad Villar, and Yannis G. Kevrekidis. Thinner latent spaces: Detecting dimension and imposing invariance through autoencoder gradient constraints. arXiv preprint arXiv:2408.16138, 2024.

[25] Ioannis G. Kevrekidis, C. William Gear, James M. Hyman, Panagiotis G. Kevrekidis, Olof Runborg, and Constantinos Theodoropoulos. Equation-free, coarse-grained multiscale computation: Enabling microscopic simulators to perform system-level analysis. Communications in Mathematical Sciences, 1(4):715–762, 2003.

[26] Stefan Klus, Feliks Nüske, Péter Koltai, Hao Wu, Ioannis Kevrekidis, Christof Schütte, and Frank Noé. Data-driven model reduction and transfer operator approximation. J. Nonlinear Sci., 28:985–1010, 2018.

[27] H. A. Kramers. Brownian motion in a field of force and the diffusion model of chemical reactions. Physica, 7(4):284–304, 1940.

[28] John M. Lee. Introduction to Smooth Manifolds, volume 218 of Graduate Texts in Mathematics. Springer, 2nd edition, 2012.

[29] Kookjin Lee and Kevin T. Carlberg. Model reduction of dynamical systems on nonlinear manifolds using deep convolutional autoencoders. Journal of Computational Physics, 404:108973, 2020.

[30] Ben Leimkuhler and Charles Matthews. Molecular Dynamics: With Deterministic and Stochastic Numerical Methods, volume 39 of Interdisciplinary Applied Mathematics. Springer, 2015.

[31] Xuechen Li, Ting-Kam Leonard Wong, Ricky T. Q. Chen, and David Duvenaud. Scalable gradients for stochastic differential equations. In Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics (AISTATS), volume 108 of PMLR, pages 3870–3882, 2020.

[32] Alec J. Linot and Michael D. Graham. Deep learning to discover and predict dynamics on an inertial manifold. Physical Review E, 101(6):062209, 2020.

[33] Hao Liu, Alex Havrilla, Rongjie Lai, and Wenjing Liao. Deep nonparametric estimation of intrinsic data structures by chart autoencoders: Generalization error and robustness. Applied and Computational Harmonic Analysis, 68:101602, 2024.

[34] Bethany Lusch, J. Nathan Kutz, and Steven L. Brunton. Deep learning for universal linear embeddings of nonlinear dynamics. Nature Communications, 9(1):4950, 2018.

[35] Emile Mathieu and Maximilian Nickel. Riemannian continuous normalizing flows. In Advances in Neural Information Processing Systems, volume 33, pages 2503–2515, 2020.

[36] K. Müller and L. D. Brown. Location of saddle points and minimum energy paths by a constrained simplex optimization procedure. Theoret. Chim. Acta, 53:75–93, 1979.

[37] Samuel E. Otto and Clarence W. Rowley. Linearly recurrent autoencoder networks for learning dynamics. SIAM Journal on Applied Dynamical Systems, 18(1):558–593, 2019.

[38] Grigorios A. Pavliotis and Andrew M. Stuart. Multiscale Methods: Averaging and Homogenization, volume 53 of Texts in Applied Mathematics. Springer, 2008.

[39] Erez Peterfreund, Ofir Lindenbaum, Felix Dietrich, Tom Bertalan, Matan Gavish, Ioannis G. Kevrekidis, and Ronald R. Coifman. Local conformal autoencoder for standardized data coordinates. Proceedings of the National Academy of Sciences, 117(49):30918–30927, 2020.

[40] Salah Rifai, Pascal Vincent, Xavier Muller, Xavier Glorot, and Yoshua Bengio. Contractive auto-encoders: explicit invariance during feature extraction. In Proceedings of the 28th International Conference on Machine Learning (ICML), ICML’11, pages 833–840, Madison, WI, USA, 2011. Omnipress.

[41] L. C. G. Rogers and David Williams. Diffusions, Markov Processes and Martingales: Volume 2, Itô Calculus. Cambridge Mathematical Library. Cambridge University Press, 2nd edition, 2000.

[42] Stefan Schonsheck, Jie Chen, and Rongjie Lai. Chart auto-encoders for manifold structured data, 2019.

[43] Stefan C. Schonsheck, Scott Mahan, Timo Klock, Alexander Cloninger, and Rongjie Lai. Semi-supervised manifold learning with complexity decoupled chart autoencoders, 2022.

[44] Daniel W. Stroock and S. R. Srinivasa Varadhan. Multidimensional Diffusion Processes, volume 233 of Grundlehren der mathematischen Wissenschaften. Springer-Verlag, Berlin, 1979.

[45] Ward Whitt. Stochastic-Process Limits: An Introduction to Stochastic-Process Limits and Their Application to Queues. Springer, New York, 2002.

[46] Yahong Yang and Juncai He. Deeper or wider: A perspective from optimal generalization error with Sobolev loss. In Proceedings of the 41st International Conference on Machine Learning (ICML), volume 235 of PMLR, pages 56109–56138, 2024.

[47] Yahong Yang and Juncai He. Deep neural networks with general activations: Super-convergence in Sobolev norms, 2025.

[48] Felix X.-F. Ye, Sichen Yang, and Mauro Maggioni. Nonlinear model reduction for slow–fast stochastic systems near unknown invariant manifolds. Journal of Nonlinear Science, 34(1):22, 2024.
