Linearly-scalable and entropy-optimal learning of nonstationary and nonlinear manifolds
Pith reviewed 2026-05-16 23:45 UTC · model grok-4.3
The pith
Entropy-optimal manifold clustering identifies metastable regime switches in nonlinear chaotic systems with linear scalability.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
EOMC mitigates cost scaling and robustness issues of existing tools while maintaining O(T) iteration complexity for data size T and allowing explicit computation of input data reliability. Application to Lorenz-96 and modified Hasegawa-Wakatani models shows their essential dynamics as a metastable regime-switching process with infrequent transitions between very persistent low-dimensional manifolds. The Markovian mean exit times and relaxation times decrease only very slowly with growing external forcing, indicating approximately two-fold longer prediction horizons than anticipated from Lyapunov exponents.
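The two timescales named in the core claim have standard textbook definitions for a discrete-time Markov chain. The sketch below illustrates those definitions only. The paper's own estimator is not reproduced in this summary, and the transition matrix `P`, the time step `dt`, and the function names are hypothetical placeholders.

```python
import numpy as np

def mean_exit_times(P, dt=1.0):
    """Mean exit time of each state: dt / (1 - P_ii) for a row-stochastic P."""
    P = np.asarray(P, dtype=float)
    return dt / (1.0 - np.diag(P))

def relaxation_time(P, dt=1.0):
    """Slowest relaxation timescale: -dt / ln|lambda_2|,
    where |lambda_2| is the second-largest eigenvalue modulus of P."""
    moduli = np.sort(np.abs(np.linalg.eigvals(P)))[::-1]
    return -dt / np.log(moduli[1])

# Hypothetical two-regime transition matrix (illustrative, not from the paper).
P = np.array([[0.99, 0.01],
              [0.02, 0.98]])

exit_times = mean_exit_times(P)   # one value per regime, in units of dt
t_relax = relaxation_time(P)      # single slowest timescale, in units of dt
```

Diagonal entries close to 1 correspond to the "very persistent" regimes described above: the closer `P_ii` is to 1, the longer the mean exit time.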
What carries the argument
Entropy-Optimal Manifold Clustering (EOMC), a method that optimizes entropy to cluster data into metastable low-dimensional manifolds in nonstationary nonlinear settings.
Load-bearing premise
That the entropy-optimal clustering procedure identifies the true metastable manifolds and transition statistics without requiring post-hoc parameter tuning or being overly sensitive to noise in the data.
What would settle it
Observing that the prediction horizons computed from the identified regime-switching process do not exceed those from standard Lyapunov exponent analysis in controlled numerical experiments on the Lorenz-96 system.
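A controlled experiment of this kind starts from Lorenz-96 trajectories. The following is a minimal sketch of the standard Lorenz-96 equations with a fourth-order Runge-Kutta integrator. The values `n_vars=40`, `forcing=8.0`, `dt=0.01`, and the function names are common illustrative choices and are assumptions here, not settings taken from the paper. The Lyapunov-exponent and regime-switching analyses themselves are not included.

```python
import numpy as np

def lorenz96_rhs(x, forcing):
    """Lorenz-96 tendency with cyclic indices:
    dx_i/dt = (x_{i+1} - x_{i-2}) * x_{i-1} - x_i + F."""
    return (np.roll(x, -1) - np.roll(x, 2)) * np.roll(x, 1) - x + forcing

def rk4_step(x, forcing, dt):
    """One classical fourth-order Runge-Kutta step."""
    k1 = lorenz96_rhs(x, forcing)
    k2 = lorenz96_rhs(x + 0.5 * dt * k1, forcing)
    k3 = lorenz96_rhs(x + 0.5 * dt * k2, forcing)
    k4 = lorenz96_rhs(x + dt * k3, forcing)
    return x + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0

def simulate_lorenz96(n_vars=40, forcing=8.0, dt=0.01, n_steps=1000, seed=0):
    """Integrate from a small random perturbation of the fixed point x_i = F."""
    rng = np.random.default_rng(seed)
    x = forcing + 0.01 * rng.standard_normal(n_vars)
    traj = np.empty((n_steps, n_vars))
    for t in range(n_steps):
        x = rk4_step(x, forcing, dt)
        traj[t] = x
    return traj

trajectory = simulate_lorenz96()  # array of shape (n_steps, n_vars)
```

Repeating the simulation for several values of `forcing` would give the family of datasets on which the two horizon estimates could be compared side by side.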
Original abstract
We propose an Entropy-Optimal Manifold Clustering (EOMC) and show that it mitigates the cost scaling and robustness issues of the existing dimensionality reduction and manifold learning tools in nonstationary and nonlinear situations, while retaining the favourable O(T) iteration complexity scaling in the statistics size T, and allowing explicit computation of input data reliability. Application to the Lorenz-96 dynamical system in the chaotic regime, as well as to a modified Hasegawa-Wakatani (mHW) model of drift-wave turbulence in the edge of a tokamak plasma, reveals that for both models the essential dynamics is best described as a metastable regime-switching process, making infrequent transitions between very persistent low-dimensional manifolds. At the same time, the Markovian mean exit times and relaxation times (that bound the predictability horizons for the identified regime-switching process) appear to decrease only very slowly with growing external forcing, indicating approximately two-fold longer prediction horizons than is currently anticipated based on analysis of positive Lyapunov exponents, even in very chaotic model regimes. It is also demonstrated that, when applied to lossy compression of the Lorenz-96 and mHW output data in various forcing regimes, EOMC achieves several orders of magnitude smaller compression loss compared to the common PCA-related linear compression approaches that form the backbone of state-of-the-art lossy data compression tools (like JPEG, MP3, and others). These findings open new exciting opportunities for EOMC and transfer operator theory, by offering new possibilities to significantly improve predictive skills and performance of data-driven tools in fluid mechanics and geosciences applications.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes Entropy-Optimal Manifold Clustering (EOMC) as a dimensionality-reduction technique for nonstationary nonlinear dynamical systems. It claims O(T) iteration complexity, explicit input-data reliability scores, and superior lossy compression relative to PCA. Applications to the chaotic Lorenz-96 system and the modified Hasegawa-Wakatani (mHW) drift-wave turbulence model are used to argue that the essential dynamics consist of infrequent transitions between persistent low-dimensional metastable manifolds, with Markovian mean exit and relaxation times that decrease only slowly with external forcing and yield predictability horizons approximately twice those inferred from positive Lyapunov exponents.
Significance. If the central claims are substantiated by rigorous validation, the work would provide a scalable, entropy-based alternative to existing manifold-learning tools for high-dimensional chaotic and turbulent flows, with direct implications for data compression and extended-range prediction in fluid mechanics and plasma physics. The explicit linkage to transfer-operator spectra and metastable regime identification would strengthen the interface between data-driven methods and dynamical-systems theory.
major comments (2)
- [Applications to Lorenz-96 and mHW] Applications section (Lorenz-96 and mHW results): the assertion that EOMC recovers the physically correct metastable manifolds and their transition statistics is load-bearing for the regime-switching interpretation and the two-fold predictability-horizon claim, yet the manuscript supplies no controlled validation against known dynamical features (e.g., forcing thresholds where regimes are expected to change) or noise-robustness tests. Without such checks it remains possible that the identified clusters are method-induced artifacts rather than intrinsic to the transfer operator.
- [Predictability horizons] Predictability-horizon paragraph: the quantitative statement that mean exit and relaxation times 'decrease only very slowly' and produce 'approximately two-fold longer' horizons than Lyapunov analysis is presented without error bars, explicit numerical values, or a direct side-by-side comparison table. This absence prevents verification that the reported factor of two is not an artifact of the particular clustering threshold or data length.
minor comments (2)
- [Abstract and Methods] The abstract and results sections would benefit from a concise statement of the precise optimization objective minimized by EOMC and the definition of the reliability score; these quantities are referenced but not written explicitly.
- [Compression results] Figure captions for the compression-loss comparisons should include the exact data lengths T, the number of retained dimensions, and the precise error metric (e.g., relative L2 or Frobenius norm) so that the claimed orders-of-magnitude improvement can be reproduced.
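To make the requested error metric concrete, here is a minimal sketch of a relative Frobenius-norm reconstruction loss for rank-`k` PCA compression of a `T x D` data matrix. It covers only the PCA baseline, since the EOMC compression scheme is not specified in this summary. The function name, the normalization by the norm of the uncentered data, and the synthetic test matrix are assumptions.

```python
import numpy as np

def pca_relative_loss(X, k):
    """Relative Frobenius loss ||X - X_hat||_F / ||X||_F after keeping
    the k leading principal components of the centered data."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    U, s, Vt = np.linalg.svd(X - mean, full_matrices=False)
    X_hat = (U[:, :k] * s[:k]) @ Vt[:k] + mean  # rank-k reconstruction
    return np.linalg.norm(X - X_hat) / np.linalg.norm(X)

# Synthetic data that is exactly rank 2 (illustrative, not from the paper).
rng = np.random.default_rng(0)
X = rng.standard_normal((500, 2)) @ rng.standard_normal((2, 10))

loss_k1 = pca_relative_loss(X, k=1)  # lossy: one component is not enough
loss_k2 = pca_relative_loss(X, k=2)  # near machine precision for rank-2 data
```

Reporting `T`, `k`, and this metric together in each caption is what would let a reader reproduce a claimed loss ratio.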
Simulated Author's Rebuttal
We thank the referee for their careful reading and constructive comments, which highlight important areas for strengthening the validation and quantitative presentation in the manuscript. We address each major comment below and indicate the specific revisions we will make.
Point-by-point responses
- Referee: Applications section (Lorenz-96 and mHW results): the assertion that EOMC recovers the physically correct metastable manifolds and their transition statistics is load-bearing for the regime-switching interpretation and the two-fold predictability-horizon claim, yet the manuscript supplies no controlled validation against known dynamical features (e.g., forcing thresholds where regimes are expected to change) or noise-robustness tests. Without such checks it remains possible that the identified clusters are method-induced artifacts rather than intrinsic to the transfer operator.
  Authors: We agree that additional controlled validation is necessary to substantiate that the identified manifolds reflect intrinsic dynamical features rather than artifacts. In the revised manuscript we will add a dedicated validation subsection for the Lorenz-96 system that compares detected regime transitions against documented forcing thresholds reported in the literature. We will also include noise-robustness experiments in which Gaussian noise of varying amplitude is added to the input trajectories, followed by re-clustering to quantify stability of the manifold assignments and transition statistics. These additions will directly address the concern and strengthen the regime-switching interpretation.
  Revision: yes
- Referee: Predictability-horizon paragraph: the quantitative statement that mean exit and relaxation times 'decrease only very slowly' and produce 'approximately two-fold longer' horizons than Lyapunov analysis is presented without error bars, explicit numerical values, or a direct side-by-side comparison table. This absence prevents verification that the reported factor of two is not an artifact of the particular clustering threshold or data length.
  Authors: We acknowledge that the predictability-horizon claims require more rigorous quantitative support. In the revision we will expand the relevant paragraph to report explicit numerical values for the mean exit and relaxation times, together with standard-error bars obtained from ensemble calculations over multiple independent realizations and data lengths. A new comparison table will be inserted that places these times side-by-side with the corresponding Lyapunov-based estimates computed on identical datasets, allowing direct verification of the reported factor of approximately two and its dependence on clustering parameters.
  Revision: yes
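The ensemble statistics promised in the rebuttal could be computed along the following lines. This is a sketch under stated assumptions: the per-realization horizon estimates are treated as independent, the ratio error uses first-order (delta-method) propagation, and the function names and input numbers are arbitrary placeholders with no connection to the paper's results.

```python
import numpy as np

def mean_and_standard_error(samples):
    """Ensemble mean and standard error (sample std with ddof=1 over sqrt(n))."""
    samples = np.asarray(samples, dtype=float)
    return samples.mean(), samples.std(ddof=1) / np.sqrt(samples.size)

def horizon_ratio(markov_times, lyapunov_times):
    """Ratio of ensemble-mean horizons with first-order error propagation,
    assuming the two sets of estimates are independent."""
    m, se_m = mean_and_standard_error(markov_times)
    l, se_l = mean_and_standard_error(lyapunov_times)
    ratio = m / l
    se_ratio = ratio * np.sqrt((se_m / m) ** 2 + (se_l / l) ** 2)
    return ratio, se_ratio

# Arbitrary placeholder values, one per independent realization.
markov_times = [10.0, 12.0, 14.0]
lyapunov_times = [4.0, 5.0, 6.0]

ratio, se_ratio = horizon_ratio(markov_times, lyapunov_times)
```

Repeating the calculation for several clustering thresholds and data lengths would show whether the reported factor is stable or depends on those choices.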
Circularity Check
No significant circularity; derivation remains self-contained
Full rationale
The paper introduces EOMC as a proposed clustering method with claimed O(T) scaling and applies it to Lorenz-96 and mHW models to identify metastable manifolds and compute exit/relaxation times. No equations or derivation steps are provided in the available text that reduce a prediction or central result to a fitted parameter or self-citation by construction. Claims of longer predictability horizons and superior compression rest on empirical application outcomes rather than tautological redefinitions. The method's entropy optimality and manifold identification are presented as independent algorithmic contributions, with no load-bearing self-citation chains or ansatz smuggling evident. This is the expected honest non-finding for a methods paper whose core assertions are testable against external benchmarks.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Data trajectories can be represented as infrequent switches between persistent low-dimensional manifolds