
arxiv: 2606.15495 · v2 · pith:GIBDVRY2new · submitted 2026-06-13 · ⚛️ physics.comp-ph · physics.chem-ph

Contrastive learning of dynamical representations for enhanced molecular sampling

Pith reviewed 2026-06-27 03:24 UTC · model grok-4.3

classification ⚛️ physics.comp-ph physics.chem-ph
keywords contrastive learning · collective variables · molecular dynamics · enhanced sampling · time-lagged independent component analysis · self-supervised learning · rare events · dynamical representations

The pith

SelfTICA reformulates collective-variable discovery as self-supervised dynamical representation learning from time-lagged configurations.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces SelfTICA to identify collective variables that capture slow dynamical modes essential for sampling rare events in molecular systems. Existing machine-learning methods typically require predefined metastable states, chosen descriptors, or high-quality kinetic training data, but SelfTICA instead defines positive and negative pairs directly from time-lagged configurations. It learns reusable features via a contrastive objective connected to spectral variational principles, then applies time-lagged independent component analysis in the learned space to extract orthogonal slow modes. This decoupling avoids direct eigendecomposition optimization and permits evaluation of spectra and variables at multiple lag times without retraining. The resulting collective variables accelerate rare-event exploration and improve free-energy convergence when tested across different atomistic systems using limited, biased, or exploratory data.

Core claim

SelfTICA is a self-supervised contrastive-learning framework that reformulates collective-variable discovery as dynamical representation learning. It defines positive and negative pairs from time-lagged molecular configurations, learns reusable features through a contrastive objective linked to spectral variational principles, and extracts orthogonal slow modes by applying time-lagged independent component analysis in the learned representation space. By decoupling representation learning from slow-mode extraction, SelfTICA avoids direct optimization of eigendecomposition-based objectives and enables spectra and collective variables to be evaluated across lag times without retraining.

What carries the argument

SelfTICA framework, which decouples contrastive representation learning on time-lagged pairs from subsequent time-lagged independent component analysis for slow-mode extraction.

If this is right

  • Spectra and collective variables can be evaluated across multiple lag times without retraining the representation model.
  • The method functions on limited, biased, or exploratory training trajectories rather than requiring high-quality kinetic data.
  • Direct optimization of eigendecomposition objectives is avoided through the decoupled contrastive-plus-TICA pipeline.
  • Rare-event exploration accelerates and free-energy convergence improves when the extracted variables are used in enhanced sampling across atomistic systems.
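
The first point above, evaluating spectra at several lag times without retraining, can be sketched as follows. The idea is that the features stay fixed and only the covariance estimates and a small eigenproblem are recomputed per lag. The helper `leading_eigenvalue`, the synthetic `features` array, and the chosen lags are hypothetical; the implied-timescale formula t = -lag / ln(lambda) is the standard one for TICA-type estimators, not something taken from the paper.

```python
import numpy as np

def leading_eigenvalue(features, lag):
    """Largest TICA eigenvalue of fixed features at one lag (in frames).

    Assumes the instantaneous covariance matrix has full rank.
    """
    x0, xt = features[:-lag], features[lag:]
    mean = 0.5 * (x0.mean(axis=0) + xt.mean(axis=0))
    x0, xt = x0 - mean, xt - mean
    c0 = (x0.T @ x0 + xt.T @ xt) / (2 * len(x0))
    ct = (x0.T @ xt + xt.T @ x0) / (2 * len(x0))
    # Generalized problem C(lag) v = lambda C(0) v, solved via C(0)^{-1/2}.
    s, u = np.linalg.eigh(c0)
    w = u / np.sqrt(s)
    return np.linalg.eigvalsh(w.T @ ct @ w)[-1]

# Hypothetical frozen features: a slow AR(1) signal mixed with fast noise.
rng = np.random.default_rng(1)
n_frames = 50000
noise = rng.normal(size=n_frames)
slow = np.zeros(n_frames)
for t in range(1, n_frames):
    slow[t] = 0.95 * slow[t - 1] + noise[t]
fast = rng.normal(size=n_frames)
features = np.column_stack([slow + fast, slow - 2.0 * fast])

# Re-evaluate the spectrum at several lags; the features are never refit.
lags = [1, 2, 5, 10]
spectrum = {lag: leading_eigenvalue(features, lag) for lag in lags}
timescales = {lag: -lag / np.log(lam) for lag, lam in spectrum.items()}
```

If the features resolve the slow process well, the implied timescales should be roughly constant across lags (here close to -1 / ln(0.95), about 19.5 frames), which is a common consistency check.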

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Representations learned on one molecular system might transfer to related but distinct systems without full retraining.
  • SelfTICA could be combined with other enhanced-sampling techniques such as metadynamics to further improve efficiency.
  • The approach may lower dependence on expert-chosen input descriptors for identifying slow dynamics in new systems.
  • Scalability to larger biomolecular or material systems remains an open question that could be tested directly.

Load-bearing premise

Positive and negative pairs defined solely from time-lagged molecular configurations are sufficient to capture the relevant slow dynamical modes without requiring predefined metastable states or high-quality kinetic information.
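
To make the premise concrete, here is a minimal sketch of one way time-lagged pairs could be built and scored. The abstract does not specify the loss, so the InfoNCE form, the in-batch negatives, the `temperature` value, and both function names are assumptions rather than the paper's method; the random embeddings stand in for encoder output.

```python
import numpy as np

def time_lagged_pairs(n_frames, lag, batch_size, rng):
    """Sample frame indices t; (t, t + lag) are the positive pairs.

    Other frames in the same batch act as negatives for each anchor.
    """
    t = rng.choice(n_frames - lag, size=batch_size, replace=False)
    return t, t + lag

def info_nce(z0, zt, temperature=0.1):
    """InfoNCE loss: row i of z0 should match row i of zt, not the other rows."""
    z0 = z0 / np.linalg.norm(z0, axis=1, keepdims=True)
    zt = zt / np.linalg.norm(zt, axis=1, keepdims=True)
    logits = z0 @ zt.T / temperature               # (B, B) cosine similarities
    logits -= logits.max(axis=1, keepdims=True)    # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))

rng = np.random.default_rng(0)
n_frames, lag = 1000, 10
i, j = time_lagged_pairs(n_frames, lag, batch_size=64, rng=rng)

# Hypothetical embeddings; in practice an encoder would map frames i and j.
z_i = rng.normal(size=(64, 8))
z_j = z_i + 0.1 * rng.normal(size=(64, 8))
loss_aligned = info_nce(z_i, z_j)
loss_mismatched = info_nce(z_i, np.roll(z_j, 1, axis=0))
```

The loss is low when each frame's embedding is closest to that of its own time-lagged partner and high when partners are scrambled, so no metastable-state labels enter the objective.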

What would settle it

A benchmark molecular system with known slow modes where collective variables produced by SelfTICA yield no measurable acceleration in rare-event sampling or free-energy convergence compared with standard descriptors or direct TICA on raw coordinates.

Figures

Figures reproduced from arXiv: 2606.15495 by Jintu Zhang, Kai Zhu, Luigi Bonati, Pietro Novelli, Tingjun Hou.

Figure 1 [PITH_FULL_IMAGE:figures/full_fig_p003_1.png]
Figure 2 [PITH_FULL_IMAGE:figures/full_fig_p004_2.png]
Figure 3 [PITH_FULL_IMAGE:figures/full_fig_p005_3.png]
Figure 4 [PITH_FULL_IMAGE:figures/full_fig_p006_4.png]
Figure 5 [PITH_FULL_IMAGE:figures/full_fig_p007_5.png]
Figure 6 [PITH_FULL_IMAGE:figures/full_fig_p007_6.png]

Original abstract

Identifying collective variables that capture slow dynamical modes is essential for sampling rare events in complex systems. Existing machine-learning approaches often require predefined metastable states, carefully chosen descriptors, or training trajectories with high-quality kinetic information. Here, we introduce SelfTICA, a self-supervised contrastive-learning framework that reformulates collective-variable discovery as dynamical representation learning. SelfTICA defines positive and negative pairs from time-lagged molecular configurations, learns reusable features through a contrastive objective linked to spectral variational principles, and extracts orthogonal slow modes by applying time-lagged independent component analysis in the learned representation space. By decoupling representation learning from slow-mode extraction, SelfTICA avoids direct optimization of eigendecomposition-based objectives and enables spectra and collective variables to be evaluated across lag times without retraining. Across different atomistic systems, SelfTICA learns dynamical representations from limited, biased, or exploratory data and converts them into collective variables that accelerate rare-event exploration and improve free-energy convergence.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 2 minor

Summary. The manuscript introduces SelfTICA, a self-supervised contrastive-learning framework for collective-variable discovery in molecular dynamics. Positive and negative pairs are constructed from time-lagged configurations; a contrastive objective linked to spectral variational principles is used to learn reusable features; time-lagged independent component analysis is then applied in the learned representation space to extract orthogonal slow modes. These modes are converted into collective variables that accelerate rare-event sampling. The approach is presented as applicable to limited, biased, or exploratory data without requiring predefined metastable states or high-quality kinetic information, and is claimed to improve free-energy convergence across atomistic systems while allowing spectra and collective variables to be evaluated across lag times without retraining.

Significance. If the central claims hold, the work supplies a self-supervised route to dynamical representation learning that decouples feature extraction from eigendecomposition-based optimization. The linkage of the contrastive loss to spectral variational principles and the subsequent use of TICA in the learned space constitute a clear technical contribution. The ability to operate on biased or exploratory trajectories without metastable labels is a practical strength that could broaden the applicability of enhanced-sampling methods in computational physics and chemistry.

minor comments (2)
  1. The abstract states performance improvements across systems but supplies no numerical values, error bars, or baseline comparisons; adding one or two quantitative highlights would strengthen the summary.
  2. Notation for the contrastive loss and the precise form of the positive/negative pair construction should be introduced with an equation number in the methods section for immediate reference.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for their positive summary of the manuscript and for recommending minor revision. The referee's description of SelfTICA accurately reflects the core technical contributions, including the contrastive formulation, decoupling of representation learning from eigendecomposition, and applicability to limited or biased trajectories. No major comments were provided in the report.

Circularity Check

0 steps flagged

No significant circularity

full rationale

The derivation defines positive/negative pairs from time-lagged configurations, links the contrastive loss to external spectral variational principles, then applies standard TICA in the learned space. Representation learning is explicitly decoupled from eigendecomposition, and no equation or claim reduces the extracted slow modes or collective variables to a fit of the target quantities themselves. The construction is self-contained against the stated external principles and does not rely on self-citation chains or renaming of known results for its central claims.

Axiom & Free-Parameter Ledger

2 free parameters · 2 axioms · 0 invented entities

Because only the abstract is available, the ledger is necessarily incomplete; the method appears to rest on standard domain assumptions in dynamical systems and contrastive learning rather than new invented entities.

free parameters (2)
  • lag time for pair construction
    The time lag used to define positive pairs is a modeling choice that directly affects which dynamical modes are emphasized.
  • contrastive loss hyperparameters
    Temperature or margin parameters typical of contrastive objectives are expected to be chosen or tuned.
axioms (2)
  • domain assumption: Time-lagged configurations encode slow dynamical modes
    The framework defines positive pairs from time lags under this premise.
  • domain assumption: Contrastive objective approximates spectral variational principles
    The abstract states the objective is linked to these principles without deriving the link.
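
For orientation, the standard textbook forms of the quantities this ledger refers to can be written down as below. These are generic expressions, not equations taken from the paper: the feature map \phi, the similarity function s, the batch size B, and the temperature T are assumed notation, and the InfoNCE form is one common contrastive objective that may differ from the one SelfTICA uses.

```latex
% Covariances of mean-free learned features \phi(x) at lag \tau
C(0) = \mathbb{E}\left[\phi(x_t)\,\phi(x_t)^{\top}\right], \qquad
C(\tau) = \mathbb{E}\left[\phi(x_t)\,\phi(x_{t+\tau})^{\top}\right]

% TICA: generalized eigenproblem. For reversible dynamics each eigenvalue is
% the autocorrelation of its mode, and the slowest modes maximize it.
C(\tau)\, v_i = \lambda_i\, C(0)\, v_i, \qquad
\lambda_i = \frac{v_i^{\top} C(\tau)\, v_i}{v_i^{\top} C(0)\, v_i}, \qquad
t_i = -\frac{\tau}{\ln \lambda_i}

% InfoNCE over a batch of B time-lagged pairs (assumed form)
\mathcal{L} = -\frac{1}{B} \sum_{i=1}^{B}
  \log \frac{\exp\left( s(x_{t_i}, x_{t_i+\tau}) / T \right)}
            {\sum_{j=1}^{B} \exp\left( s(x_{t_i}, x_{t_j+\tau}) / T \right)}
```

Both free parameters listed above appear explicitly here: the lag \tau enters the pair construction and the covariances, and the temperature T scales the similarities inside the loss.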


