Information bottleneck for learning the phase space of dynamics from high-dimensional experimental data
Pith reviewed 2026-05-07 16:59 UTC · model grok-4.3
The pith
DySIB recovers the two-dimensional phase space of a pendulum from high-dimensional video data by maximizing predictive mutual information in latent space.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
DySIB recovers a two-dimensional representation that matches the dimensionality, topology, and geometry of the pendulum phase space, with the learned coordinates aligning smoothly with the canonical angle and angular velocity. Hyperparameters are set self-consistently by the data, and the entire procedure operates in latent space to demonstrate that predictive information alone can yield interpretable dynamical coordinates directly from high-dimensional observations.
What carries the argument
The Dynamical Symmetric Information Bottleneck (DySIB), an objective that maximizes predictive mutual information between past and future latent windows while penalizing representation complexity.
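The loss itself is not reproduced on this page. As a rough guide, a symmetric past-future information bottleneck of this kind is usually written as the following trade-off, where X_past and X_future are the observation windows, Z_past and Z_future their stochastic encodings, and β the trade-off parameter (a generic sketch, not the paper's exact notation):

```latex
% Generic form of a symmetric (past-future) information bottleneck objective.
\max_{q(z_p \mid x_p),\; q(z_f \mid x_f)}
  \; I(Z_{\text{past}};\, Z_{\text{future}})
  \;-\; \beta \bigl[\, I(X_{\text{past}};\, Z_{\text{past}}) + I(X_{\text{future}};\, Z_{\text{future}}) \,\bigr]
```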
If this is right
- The method recovers interpretable dynamical state variables from high-dimensional time-series data without any reconstruction loss or external labels.
- Hyperparameters of the encoder and bottleneck can be chosen self-consistently from the data itself.
- Success on a well-characterized experimental system shows that predictive information in latent space is enough to extract physically meaningful coordinates.
- The approach avoids direct reconstruction of observations, focusing computation entirely on the latent dynamical representation.
Where Pith is reading between the lines
- If the same objective works on systems whose phase space dimension is unknown in advance, it could serve as a general tool for discovering hidden state variables in experimental recordings.
- Applying the method to time series from chaotic or high-dimensional attractors would test whether it can recover non-trivial topology and geometry without prior knowledge.
- One could combine DySIB coordinates with downstream control algorithms to perform model-free stabilization or prediction directly from video.
- Extending the bottleneck to include multiple future horizons might improve robustness when the underlying dynamics contain multiple timescales.
Load-bearing premise
Maximizing predictive mutual information between past and future observation windows in latent space is sufficient to recover the true underlying dynamical state variables without additional supervision or reconstruction.
What would settle it
Running DySIB on the pendulum video dataset and checking whether the resulting two-dimensional latent coordinates vary smoothly with independently measured angle and angular velocity and reproduce the cylindrical topology of the phase space; a failure on either count would falsify the central claim.
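As an illustration of what such a check could look like in practice, the sketch below regresses hypothetical learned latents z onto smooth functions of independently measured angle θ and angular velocity ω and reports an R² per latent coordinate; the variable names and the specific diagnostic are assumptions for illustration, not the paper's evaluation code.

```python
# Illustrative alignment check (hypothetical arrays, not the paper's code):
# z     : (N, 2) learned latent coordinates
# theta : (N,)   independently measured pendulum angle (radians)
# omega : (N,)   independently measured angular velocity
import numpy as np

def alignment_r2(z, theta, omega):
    # Use (cos θ, sin θ) instead of θ so the angle's periodicity (the cylinder
    # topology of the phase space) does not introduce an artificial jump.
    X = np.column_stack([np.cos(theta), np.sin(theta), omega, np.ones_like(omega)])
    coef, *_ = np.linalg.lstsq(X, z, rcond=None)   # affine least-squares fit
    resid = z - X @ coef
    ss_res = (resid ** 2).sum(axis=0)
    ss_tot = ((z - z.mean(axis=0)) ** 2).sum(axis=0)
    return 1.0 - ss_res / ss_tot                   # R^2 for each latent coordinate

# R^2 near 1 for both coordinates supports smooth alignment with the canonical
# variables; low R^2 or structured residuals would count against the claim.
```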
original abstract
Identifying the dynamical state variables of a system from high-dimensional observations is a central problem across physical sciences. The challenge is that the state variables are not directly observable and must be inferred from raw high-dimensional data without supervision. Here we introduce DySIB (Dynamical Symmetric Information Bottleneck) as a method to learn low-dimensional representations of time-series data by maximizing predictive mutual information between past and future observation windows while penalizing representation complexity. This objective operates entirely in latent space and avoids reconstruction of the observations. We apply DySIB to an experimental video dataset of a physical pendulum, where the underlying state space is known. The method, with hyperparameters of the learning architecture set self-consistently by the data, recovers a two-dimensional representation that matches the dimensionality, topology, and geometry of the pendulum phase space, with the learned coordinates aligning smoothly with the canonical angle and angular velocity. These results demonstrate, on a well-characterized experimental system, that predictive information in latent space can be used to recover interpretable dynamical coordinates directly from high-dimensional data.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces DySIB, a dynamical symmetric information bottleneck method for learning low-dimensional latent representations of time-series data. The approach maximizes predictive mutual information between past and future observation windows in latent space while applying a complexity penalty, operating without direct reconstruction of the high-dimensional observations. Applied to an experimental video dataset of a physical pendulum with known underlying state space, the method (with data-driven hyperparameter selection) is claimed to recover a two-dimensional representation whose dimensionality, topology, and geometry match the pendulum phase space, with the learned coordinates aligning smoothly to the canonical angle and angular velocity.
Significance. If the central claims hold under scrutiny, the work would represent a meaningful contribution to unsupervised extraction of interpretable dynamical coordinates from high-dimensional experimental data, with potential applications across physics and related fields. The avoidance of reconstruction and the use of self-consistent hyperparameter setting are positive features. However, the invariance of the predictive mutual information objective to diffeomorphisms of the latent variables means that specific smooth alignment with canonical coordinates is not automatically guaranteed by the information-bottleneck principle, which weakens the interpretability claim unless additional mechanisms are demonstrated.
major comments (2)
- [Abstract] Abstract and results description: the claim that the learned coordinates align smoothly with the canonical angle and angular velocity is load-bearing for the interpretability result, yet the objective (maximizing I(Z_past; Z_future) subject to a complexity penalty) is preserved under any invertible reparametrization of the latent variables. No mechanism (symmetry-breaking term, canonicalization step, or uniqueness argument) is identified in the provided description that would select this particular gauge over other equally optimal coordinate systems; the observed alignment could therefore stem from architectural biases, initialization, or post-hoc choices rather than the DySIB objective itself.
- [Results] Methods and results: the abstract reports successful recovery but provides no quantitative metrics (e.g., alignment error between learned and canonical coordinates, topological invariants such as winding numbers, or geometry measures such as curvature or metric distortion). Without these, or explicit external benchmarks independent of the learned representation, it is impossible to verify that the recovered 2D space matches the true phase space beyond qualitative visual inspection.
minor comments (2)
- [Abstract] The acronym DySIB is introduced without an immediate expansion in the abstract; this should be corrected for clarity.
- [Methods] Notation for the latent variables (Z_past, Z_future) and the precise form of the complexity penalty should be defined explicitly at first use in the methods section.
Simulated Author's Rebuttal
We thank the referee for their careful reading and constructive comments. We address each major comment below and have revised the manuscript to clarify the claims and strengthen the supporting evidence.
point-by-point responses
Referee: [Abstract] Abstract and results description: the claim that the learned coordinates align smoothly with the canonical angle and angular velocity is load-bearing for the interpretability result, yet the objective (maximizing I(Z_past; Z_future) subject to a complexity penalty) is preserved under any invertible reparametrization of the latent variables. No mechanism (symmetry-breaking term, canonicalization step, or uniqueness argument) is identified in the provided description that would select this particular gauge over other equally optimal coordinate systems; the observed alignment could therefore stem from architectural biases, initialization, or post-hoc choices rather than the DySIB objective itself.
Authors: We agree that the predictive mutual information objective is invariant under diffeomorphisms of the latent variables, and the original manuscript does not provide an explicit symmetry-breaking term, canonicalization procedure, or uniqueness theorem that would guarantee selection of the canonical gauge. The reported alignment is an empirical outcome of the training procedure. In the revised manuscript we have added a dedicated paragraph in the methods section acknowledging this invariance and discussing how the observed alignment arises consistently from the combination of the encoder architecture, random initialization, and data-driven hyperparameter selection. We have also included results from ten independent training runs with different random seeds, showing that the alignment with angle and angular velocity is robust (with alignment error remaining below a stated threshold after optimal affine matching). revision: yes
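For readers following this exchange, the invariance both parties accept is the standard fact that mutual information is unchanged by invertible reparametrization of either argument (a textbook statement, not a quotation from the paper):

```latex
I\bigl(g(Z_{\text{past}});\, h(Z_{\text{future}})\bigr)
  \;=\; I\bigl(Z_{\text{past}};\, Z_{\text{future}}\bigr)
  \quad \text{for any invertible maps } g,\, h,
```

so every diffeomorphic image of an optimal representation attains the same objective value, and the bottleneck alone cannot single out the canonical (angle, angular-velocity) gauge.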
Referee: [Results] Methods and results: the abstract reports successful recovery but provides no quantitative metrics (e.g., alignment error between learned and canonical coordinates, topological invariants such as winding numbers, or geometry measures such as curvature or metric distortion). Without these, or explicit external benchmarks independent of the learned representation, it is impossible to verify that the recovered 2D space matches the true phase space beyond qualitative visual inspection.
Authors: We concur that quantitative metrics are necessary to move beyond qualitative visual assessment. The revised manuscript now includes three explicit quantitative measures: (i) the root-mean-square alignment error between the learned coordinates and the canonical angle/angular-velocity after determining the optimal affine transformation, (ii) the winding numbers of closed orbits in the latent space to confirm topological equivalence, and (iii) a local geometry comparison that quantifies metric distortion relative to the known pendulum phase-space metric. These statistics are reported in a new table and are computed on held-out test trajectories independent of the training data. revision: yes
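Of the three proposed measures, the winding number is the most self-contained to illustrate; below is a minimal sketch under the assumption that a closed orbit is available as an (N, 2) array of latent points (hypothetical names, not the paper's code).

```python
# Illustrative winding-number estimate (hypothetical arrays, not the paper's code):
# orbit  : (N, 2) latent points sampled along one (nearly) closed trajectory
# center : (2,)   a reference point enclosed by the orbit
import numpy as np

def winding_number(orbit, center):
    v = orbit - center                               # vectors from the reference point
    ang = np.arctan2(v[:, 1], v[:, 0])               # polar angle of each sample
    dang = np.diff(ang)
    dang = (dang + np.pi) % (2 * np.pi) - np.pi      # wrap increments to the principal branch
    return dang.sum() / (2 * np.pi)                  # ~ +/-1 for a simple closed loop

# A librating pendulum orbit that is mapped faithfully into the latent plane
# should wind once around an interior point, matching the closed orbits of the
# true (angle, angular-velocity) phase portrait.
```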
Circularity Check
No significant circularity in empirical demonstration of DySIB
full rationale
The paper defines DySIB via maximization of predictive mutual information between past and future latent windows (with complexity penalty) and applies it to experimental pendulum video data. The central result is an empirical match between the learned 2D latent representation and the known pendulum phase space (dimension, topology, geometry, and smooth alignment with angle/velocity). This match is validated against an external, independently known ground truth rather than being derived from the training objective by construction. Hyperparameter selection is described as self-consistent with the data, but this is a standard model-selection step and does not reduce the reported alignment to a tautology. No load-bearing self-citations, uniqueness theorems, or ansatzes that presuppose the target coordinates appear in the abstract or claimed derivation; the method remains falsifiable against the physical system.
Reference graph
Works this paper leans on
- [1] (Excerpt from the paper's methods, "Delayed embeddings and the shared encoder": observations are a sequence of high-dimensional frames {F_1, F_2, ...} with each F_t ∈ R^D, where D is the dimensionality of the observation space; at each time t a past-future pair is constructed from consecutive segments of this trajectory in the observation space ...)
- [2] A. L. Hodgkin and A. F. Huxley, A quantitative description of membrane current and its application to conduction and excitation in nerve, The Journal of Physiology 117, 500 (1952).
- [3] J. Toner and Y. Tu, Long-range order in a two-dimensional dynamical XY model: how birds fly together, Physical Review Letters 75, 4326 (1995).
- [4] N. Goldenfeld, Lectures on Phase Transitions and the Renormalization Group (CRC Press, 2018).
- [5] A. Cavagna, L. Di Carlo, I. Giardina, T. S. Grigera, S. Melillo, L. Parisi, G. Pisegna, and M. Scandolo, Natural swarms in 3.99 dimensions, Nature Physics 19, 1043 (2023).
- [6] B. C. Daniels, W. S. Ryu, and I. Nemenman, Automated, predictive, and interpretable inference of Caenorhabditis elegans escape dynamics, Proceedings of the National Academy of Sciences 116, 7226 (2019).
- [7] V. Bapst, T. Keck, A. Grabska-Barwinska, C. Donner, E. D. Cubuk, S. S. Schoenholz, A. Obika, A. W. R. Nelson, T. Back, D. Hassabis, and P. Kohli, Unveiling the predictive power of static structure in glassy systems, Nature Physics 16, 448 (2020).
- [8] M. S. Schmitt, J. Colen, S. Sala, J. Devany, S. Seetharaman, A. Caillier, M. L. Gardel, P. W. Oakes, and V. Vitelli, Machine learning interpretable models of cell mechanics from protein images, Cell 187, 481 (2024).
- [9] W. Yu, E. Abdelaleem, I. Nemenman, and J. C. Burton, Physics-tailored machine learning reveals unexpected physics in dusty plasmas, Proceedings of the National Academy of Sciences 122, e2505725122 (2025).
- [10] G. J. Stephens, B. Johnson-Kerner, W. Bialek, and W. S. Ryu, Dimensionality and dynamics in the behavior of C. elegans, PLOS Computational Biology 4, e1000028 (2008).
- [11] J. P. Cunningham and B. M. Yu, Dimensionality reduction for large-scale neural recordings, Nature Neuroscience 17, 1500 (2014).
- [12] S. S. Schoenholz, E. D. Cubuk, D. M. Sussman, E. Kaxiras, and A. J. Liu, A structural approach to relaxation in glassy liquids, Nature Physics 12, 469 (2016).
- [13] F. Noé and C. Clementi, Collective variables for the study of long-time kinetics from molecular trajectories: theory and methods, Current Opinion in Structural Biology 43, 141 (2017).
- [14] E. D. Cubuk, R. J. S. Ivancic, S. S. Schoenholz, D. J. Strickland, A. Basu, Z. S. Davidson, J. Fontaine, J. L. Hor, Y.-R. Huang, Y. Jiang, N. C. Keim, K. D. Koshigan, J. A. Lefever, T. Liu, X.-G. Ma, D. J. Magagnosc, E. Morrow, C. P. Ortiz, J. M. Rieser, A. Shavit, T. Still, Y. Xu, Y. Zhang, K. N. Nordstrom, P. E. Arratia, R. W. Carpick, D. J. Durian, Z. ... (2017).
- [15] C. Pandarinath, D. J. O'Shea, J. Collins, R. Jozefowicz, S. D. Stavisky, J. C. Kao, E. M. Trautmann, M. T. Kaufman, S. I. Ryu, L. R. Hochberg, J. M. Henderson, K. V. Shenoy, L. F. Abbott, and D. Sussillo, Inferring single-trial neural population dynamics using sequential auto-encoders, Nature Methods 15, 805 (2018).
- [16] T. Ahamed, A. C. Costa, and G. J. Stephens, Capturing the continuous complexity of behaviour in Caenorhabditis elegans, Nature Physics 17, 275 (2021).
- [17] J. Colen, M. Han, R. Zhang, S. A. Redford, L. M. Lemma, L. Morgan, P. V. Ruijgrok, R. Adkins, Z. Bryant, Z. Dogic, M. L. Gardel, J. J. de Pablo, and V. Vitelli, Machine learning active-nematic hydrodynamics, Proceedings of the National Academy of Sciences 118, e2016708118 (2021).
- [18] R. Supekar, B. Song, A. Hastewell, G. P. T. Choi, A. Mietke, and J. Dunkel, Learning hydrodynamic equations for active matter from particle simulations and experiments, Proceedings of the National Academy of Sciences 120, e2206994120 (2023).
- [19] M. Schmidt and H. Lipson, Distilling free-form natural laws from experimental data, Science 324, 81 (2009).
- [20] B. C. Daniels and I. Nemenman, Automated adaptive inference of phenomenological dynamical models, Nature Communications 6, 8133 (2015).
- [21] S. L. Brunton, J. L. Proctor, and J. N. Kutz, Discovering governing equations from data by sparse identification of nonlinear dynamical systems, Proceedings of the National Academy of Sciences 113, 3932 (2016).
- [22] N. M. Mangan, T. Askham, S. L. Brunton, J. N. Kutz, and J. L. Proctor, Model selection for hybrid dynamical systems via sparse regression, Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences 475, 20180534 (2019).
- [23] A. Frishman and P. Ronceray, Learning force fields from stochastic trajectories, Physical Review X 10, 021009 (2020).
- [24] P. A. K. Reinbold, L. M. Kageorge, M. F. Schatz, and R. O. Grigoriev, Robust learning from noisy, incomplete, high-dimensional experimental data via physically constrained symbolic regression, Nature Communications 12, 3219 (2021).
- [25] D. R. Gurevich, M. R. Golden, P. A. K. Reinbold, and R. O. Grigoriev, Learning fluid physics from highly turbulent data using sparse physics-informed discovery of empirical relations (SPIDER), Journal of Fluid Mechanics 996, A25 (2024).
- [26] B. Lusch, J. N. Kutz, and S. L. Brunton, Deep learning for universal linear embeddings of nonlinear dynamics, Nature Communications 9, 4950 (2018).
- [27] K. Champion, B. Lusch, J. N. Kutz, and S. L. Brunton, Data-driven discovery of coordinates and governing equations, Proceedings of the National Academy of Sciences 116, 22445 (2019).
- [28] A. J. Linot and M. D. Graham, Deep learning to discover and predict dynamics on an inertial manifold, Physical Review E 101, 062209 (2020).
- [29] J. Page, M. P. Brenner, and R. R. Kerswell, Revealing the state space of turbulence using machine learning, Physical Review Fluids 6, 034402 (2021).
- [30] B. Chen, K. Huang, S. Raghupathi, I. Chandratreya, Q. Du, and H. Lipson, Automated discovery of fundamental variables hidden in experimental data, Nature Computational Science 2, 433 (2022).
- [31] P. R. Vlachas, G. Arampatzis, C. Uhler, and P. Koumoutsakos, Multiscale simulations of complex systems by learning their effective dynamics, Nature Machine Intelligence 4, 359 (2022).
- [32] T. B. Brown, B. Mann, N. Ryder, M. Subbiah, J. D. Kaplan, P. Dhariwal, A. Neelakantan, P. Shyam, G. Sastry, A. Askell, S. Agarwal, A. Herbert-Voss, G. Krueger, T. Henighan, R. Child, A. Ramesh, D. M. Ziegler, J. Wu, C. Winter, C. Hesse, M. Chen, E. Sigler, M. Litwin, S. Gray, B. Chess, J. Clark, C. Berner, S. McCandlish, A. Radford, I. Sutskever, and D. ..., arXiv (2020).
- [33]
- [34] R. Lam, A. Sanchez-Gonzalez, M. Willson, P. Wirnsberger, M. Fortunato, F. Alet, S. Ravuri, T. Ewalds, Z. Eaton-Rosen, W. Hu, A. Merose, S. Hoyer, G. Holland, O. Vinyals, J. Stott, A. Pritzel, S. Mohamed, and P. Battaglia, Learning skillful medium-range global weather forecasting, Science 382, 1416 (2023).
- [35] E. Abdelaleem, A. Roman, K. M. Martini, and I. Nemenman, Simultaneous dimensionality reduction: A data efficient approach for multimodal representations learning, Transactions on Machine Learning Research (2024), arXiv:2310.04458.
- [36] A. Swain, S. A. Ridout, and I. Nemenman, Better together: Cross and joint covariances enhance signal detectability in undersampled data, arXiv:2507.22207 [cond-mat.dis-nn] (2025).
- [37] P. Mergny and L. Zdeborová, Spectral thresholds in correlated spiked models and fundamental limits of partial least squares, in Proceedings of the 29th International Conference on Artificial Intelligence and Statistics (AISTATS) (2026), arXiv:2510.17561 [math.ST].
- [38]
- [39] L. Wiskott and T. J. Sejnowski, Slow feature analysis: Unsupervised learning of invariances, Neural Computation 14, 715 (2002).
- [40] R. Balestriero and Y. LeCun, LeJEPA: Provable and scalable self-supervised learning without the heuristics, arXiv:2511.08544 [cs.LG] (2025).
- [41] L. Maes, Q. Le Lidec, D. Scieur, Y. LeCun, and R. Balestriero, LeWorldModel: Stable end-to-end joint-embedding predictive architecture from pixels, arXiv:2603.19312 [cs.LG] (2026).
- [42] K. M. Martini and I. Nemenman, Data efficiency, dimensionality reduction, and the generalized symmetric information bottleneck, Neural Computation 36, 1353 (2024).
- [43] H. Van Assel, M. Ibrahim, T. Biancalani, A. Regev, and R. Balestriero, Joint embedding vs reconstruction: Provable benefits of latent space prediction for self-supervised learning, arXiv:2505.12477 [cs.LG] (2025).
- [44] A. van den Oord, Y. Li, and O. Vinyals, Representation learning with contrastive predictive coding, arXiv:1807.03748 [cs.LG] (2018).
- [45]
- [46] M. S. Schmitt, M. Koch-Janusz, M. Fruchart, D. S. Seara, M. Rust, and V. Vitelli, Information theory for data-driven model reduction in physics and biology, bioRxiv:2024.04.19.590281 (2024).
- [47] R. Meng and K. E. Bouchard, Bayesian inference of structured latent spaces from neural population activity with the orthogonal stochastic linear mixing model, PLOS Computational Biology 20, e1011975 (2024).
- [48] E. Abdelaleem, I. Nemenman, and K. M. Martini, Deep variational multivariate information bottleneck - a framework for variational losses, Journal of Machine Learning Research 26, 1 (2025).
- [49] N. Tishby, F. C. Pereira, and W. Bialek, The information bottleneck method, in 37th Annual Allerton Conference on Communication, Control, and Computing (1999), pp. 368-377.
- [50] A. Alemi, I. Fischer, J. Dillon, and K. Murphy, Deep variational information bottleneck, in International Conference on Learning Representations (2017), arXiv:1612.00410 [cs.LG].
- [51] T. M. Cover and J. A. Thomas, Elements of Information Theory, 2nd ed. (Wiley-Interscience, 2006).
- [52] N. Friedman, O. Mosenzon, N. Slonim, and N. Tishby, Multivariate information bottleneck, in Proceedings of the 17th Conference on Uncertainty in Artificial Intelligence (UAI) (2001), pp. 152-161.
- [53] M. Studený and J. Vejnarová, The multiinformation function as a tool for measuring stochastic dependence, in Learning in Graphical Models, edited by M. I. Jordan (Springer, 1998), pp. 261-297.
- [54] M. I. Jordan, Z. Ghahramani, T. S. Jaakkola, and L. K. Saul, An introduction to variational methods for graphical models, Machine Learning 37, 183 (1999).
- [55] D. P. Kingma and M. Welling, Auto-encoding variational Bayes, in International Conference on Learning Representations (2014), arXiv:1312.6114 [stat.ML].
- [56] E. Abdelaleem, K. M. Martini, and I. Nemenman, Accurate estimation of mutual information in high dimensional data, arXiv:2506.00330 [physics.data-an] (2025).
- [57] F. Takens, Detecting strange attractors in turbulence, in Dynamical Systems and Turbulence, Warwick 1980, Lecture Notes in Mathematics, Vol. 898 (Springer, 1981), pp. 366-381.
- [58] R. T. Q. Chen, Y. Rubanova, J. Bettencourt, and D. K. Duvenaud, Neural ordinary differential equations, Advances in Neural Information Processing Systems 31 (2018), arXiv:1806.07366 [cs.LG].
- [59] K. He, X. Zhang, S. Ren, and J. Sun, Deep residual learning for image recognition, in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2016), pp. 770-778.
- [60]
- [61] N. H. Packard, J. P. Crutchfield, J. D. Farmer, and R. S. Shaw, Geometry from a time series, Physical Review Letters 45, 712 (1980).
- [62] J.-P. Eckmann and D. Ruelle, Ergodic theory of chaos and strange attractors, Reviews of Modern Physics 57, 617 (1985).
- [63] J. P. Crutchfield and B. S. McNamara, Equations of motion from a data series, Complex Systems 1, 417 (1987).
- [64] G. Sugihara and R. M. May, Nonlinear forecasting as a way of distinguishing chaos from measurement error in time series, Nature 344, 734 (1990).
- [65] M. B. Kennel, R. Brown, and H. D. I. Abarbanel, Determining embedding dimension for phase-space reconstruction using a geometrical construction, Physical Review A 45, 3403 (1992).
- [66] M. Ushio, C.-H. Hsieh, R. Masuda, E. R. Deyle, H. Ye, C.-W. Chang, G. Sugihara, and M. Kondoh, Fluctuating interaction network and time-varying stability of a natural fish community, Nature 554, 360 (2018).
- [67] P. Grassberger, Toward a quantitative theory of self-generated complexity, International Journal of Theoretical Physics 25, 907 (1986).
- [68] W. Bialek, I. Nemenman, and N. Tishby, Predictability, complexity, and learning, Neural Computation 13, 2409 (2001).
- [69] F. Creutzig, A. Globerson, and N. Tishby, Past-future information bottleneck in dynamical systems, Physical Review E 79, 041925 (2009).
- [70] M. Assran, Q. Duval, I. Misra, P. Bojanowski, P. Vincent, M. Rabbat, Y. LeCun, and N. Ballas, Self-supervised learning from images with a joint-embedding predictive architecture, in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (2023), pp. 15619-15629.
- [71] M. M. Peixoto, Structural stability on two-dimensional manifolds, Topology 1, 101 (1962).
- [72] A. Hyvärinen and P. Pajunen, Nonlinear independent component analysis: Existence and uniqueness results, Neural Networks 12, 429 (1999).
- [73] O. Yair, R. Talmon, R. R. Coifman, and I. G. Kevrekidis, Reconstruction of normal forms by learning informed observation geometries from data, Proceedings of the National Academy of Sciences 114, E7865 (2017).
- [74] S.-H. Li, C.-X. Dong, L. Zhang, and L. Wang, Neural canonical transformation with symplectic flows, Physical Review X 10, 021020 (2020).
- [75] M. D. Donsker and S. R. S. Varadhan, Asymptotic evaluation of certain Markov process expectations for large time. IV, Communications on Pure and Applied Mathematics 36, 183 (1983).
- [76] B. Poole, S. Ozair, A. Van Den Oord, A. Alemi, and G. Tucker, On variational bounds of mutual information, in International Conference on Machine Learning (2019), pp. 5171-5180.
- [77] E. Levina and P. Bickel, Maximum likelihood estimation of intrinsic dimension, Advances in Neural Information Processing Systems 17 (2004).
- [78] E. Facco, M. d'Errico, A. Rodriguez, and A. Laio, Estimating the intrinsic dimension of datasets by a minimal neighborhood information, Scientific Reports 7, 12140 (2017).
- [79] (Excerpt from the paper's appendix, "Architecture": the shared encoder Φ is a three-layer MLP with hidden width 256 and ReLU activations that maps each frame F_t ∈ R^784 to a per-frame embedding of dimension d_F = 32; the concatenated delayed embedding [Φ(F_t), ..., Φ(F_{t+n_F-1})] ∈ R^{32 n_F} is passed through two parallel linear heads W_μ and W_ℓ producing the mean μ(x) ∈ R^{k_z} and log-variance ℓ(x) ∈ R^{k_z} ...)
- [80] (Excerpt from the paper's appendix, "Training": the model is trained on the experimental pendulum dataset [29], using up to the first 1000 videos for training and the final 200 for held-out evaluation; the original 128×128 RGB frames are downsampled to 28×28 grayscale (D = 784); each video contains T = 60 frames, giving T - 2n_F + 1 valid past-future pairs per trajectory; training uses the Adam optimizer ...)
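The architecture excerpt in [79] pins down the encoder closely enough to sketch. The PyTorch fragment below is an illustrative reading of that description, with the latent dimension k_z, the number of concatenated frames n_F, and everything the excerpt does not state treated as placeholders rather than the authors' implementation.

```python
# Sketch of the per-frame encoder and variational heads described in [79]:
# a three-layer MLP (hidden width 256, ReLU) maps a 28x28 frame (784 values)
# to a 32-dimensional embedding; n_F consecutive embeddings are concatenated
# and fed to two linear heads for the latent mean and log-variance.
# k_z, n_frames, and all unstated details are assumptions, not the paper's code.
import torch
import torch.nn as nn

class DelayedEmbeddingEncoder(nn.Module):
    def __init__(self, d_in=784, d_hidden=256, d_frame=32, n_frames=4, k_z=2):
        super().__init__()
        self.phi = nn.Sequential(                  # shared per-frame encoder Φ
            nn.Linear(d_in, d_hidden), nn.ReLU(),
            nn.Linear(d_hidden, d_hidden), nn.ReLU(),
            nn.Linear(d_hidden, d_frame),
        )
        self.W_mu = nn.Linear(d_frame * n_frames, k_z)    # mean head μ(x)
        self.W_ell = nn.Linear(d_frame * n_frames, k_z)   # log-variance head ℓ(x)

    def forward(self, frames):
        # frames: (batch, n_frames, 784); Φ acts on the last dimension
        emb = self.phi(frames)                     # (batch, n_frames, d_frame)
        x = emb.reshape(emb.shape[0], -1)          # concatenated delayed embedding
        return self.W_mu(x), self.W_ell(x)         # μ(x), ℓ(x)
```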