
arxiv: 2605.16682 · v1 · pith:4VZ6DEV5new · submitted 2026-05-15 · 💻 cs.LG

Identify Then Project: Contrastive Learning of Latent Dynamics from Partial Observations with Port-Hamiltonian Structure

Pith reviewed 2026-05-20 19:12 UTC · model grok-4.3

classification 💻 cs.LG
keywords contrastive learning · port-Hamiltonian · latent dynamics · partial observations · physics-informed · representation learning · dynamical systems

The pith

A two-stage identify-then-project framework learns reliable latent port-Hamiltonian dynamics from partial observations.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper aims to learn latent dynamics that respect port-Hamiltonian structure when only partial, high-dimensional observations are available. A contrastive teacher first identifies continuous-time latent dynamics; a student then projects them onto the structured submanifold through a learned affine chart. The separation is motivated by the finding that learning identification and structure jointly is less reliable. The result matters to readers who need models that are both data-driven and physically consistent, for example for mechanical or other energy-based systems.
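To make "port-Hamiltonian structure" concrete, here is a minimal sketch of a damped pendulum written in the standard form dx/dt = (J - R) grad H(x). It is not the paper's model: the Hamiltonian, the damping value 0.1, the RK4 integrator, and all function names are assumptions chosen for illustration.

```python
import numpy as np

# Hypothetical example system: damped pendulum with state x = (q, p)
# and energy H(q, p) = p^2 / 2 + (1 - cos q).
J = np.array([[0.0, 1.0], [-1.0, 0.0]])  # skew-symmetric interconnection matrix
R = np.array([[0.0, 0.0], [0.0, 0.1]])   # symmetric PSD dissipation (assumed damping 0.1)


def hamiltonian(x):
    q, p = x
    return 0.5 * p**2 + (1.0 - np.cos(q))


def grad_hamiltonian(x):
    q, p = x
    return np.array([np.sin(q), p])


def ph_vector_field(x):
    # Port-Hamiltonian dynamics without external input: dx/dt = (J - R) grad H(x).
    return (J - R) @ grad_hamiltonian(x)


def rk4_step(f, x, dt):
    # Classical fourth-order Runge-Kutta step.
    k1 = f(x)
    k2 = f(x + 0.5 * dt * k1)
    k3 = f(x + 0.5 * dt * k2)
    k4 = f(x + dt * k3)
    return x + dt / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)


def rollout(x0, dt=0.01, steps=2000):
    xs = [np.asarray(x0, dtype=float)]
    for _ in range(steps):
        xs.append(rk4_step(ph_vector_field, xs[-1], dt))
    return np.stack(xs)


traj = rollout([1.0, 0.0])
energy = np.array([hamiltonian(x) for x in traj])
print("initial energy:", energy[0], "final energy:", energy[-1])
```

Because J is skew-symmetric and R is positive semidefinite, the energy rate is grad H(x)^T (J - R) grad H(x) = -grad H(x)^T R grad H(x) <= 0. Setting R to zero gives the conservative case, in which the energy is constant along exact trajectories.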

Core claim

We show theoretically that affine projection is the natural bridge between the affine gauge of contrastive latent identification and the port-Hamiltonian systems. Empirically, the two-stage approach preserves the teacher's dynamics while enforcing physical structure and performs more reliably than the single-stage alternative, particularly in dissipative regimes and high-dimensional visual settings.

What carries the argument

Affine projection via a learned affine chart that maps identified latent dynamics onto a port-Hamiltonian submanifold.
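A standard computation, given here as orientation and not as the paper's theorem, shows why an affine chart is compatible with port-Hamiltonian structure. The symbols $J$, $R$, $H$, $A$, $b$ are notation assumed for this sketch, with $x$ the port-Hamiltonian coordinates, $z$ the identified latent coordinates, and $A$ invertible.

```latex
\begin{aligned}
\dot{x} &= (J - R)\,\nabla_x H(x), \qquad J = -J^{\top}, \quad R = R^{\top} \succeq 0, \\
x &= A z + b, \qquad \tilde{H}(z) := H(A z + b)
  \;\Longrightarrow\; \nabla_z \tilde{H}(z) = A^{\top} \nabla_x H(x), \\
\dot{z} &= A^{-1}\dot{x}
  = \underbrace{A^{-1} J A^{-\top}}_{\tilde{J}}\,\nabla_z \tilde{H}(z)
  - \underbrace{A^{-1} R A^{-\top}}_{\tilde{R}}\,\nabla_z \tilde{H}(z), \\
\tilde{J}^{\top} &= -\tilde{J}, \qquad \tilde{R} = \tilde{R}^{\top} \succeq 0 .
\end{aligned}
```

Under these assumptions the transformed system is again port-Hamiltonian, so a latent representation that is identified only up to an affine map can still carry this structure.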

Load-bearing premise

The contrastive teacher successfully learns accurate continuous-time latent dynamics from partial observations before the projection step is applied; if the initial identification is poor, the subsequent affine projection cannot recover a faithful port-Hamiltonian realization.

What would settle it

Measure the teacher's identification accuracy (prediction error before projection) and check whether the two-stage model fails to match ground-truth trajectories when that accuracy is low.
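A check of this kind can be prototyped on synthetic data. The sketch below fits a single affine alignment on a train split and reports held-out R² as a stand-in "teacher" latent is degraded with noise. It does not reproduce the paper's AUC-R² metric or its data; the helper names, noise levels, and the synthetic latent are all assumptions.

```python
import numpy as np


def fit_affine(latent, target):
    """Least-squares affine map so that target is approximately latent @ A + b."""
    design = np.hstack([latent, np.ones((latent.shape[0], 1))])
    weights, *_ = np.linalg.lstsq(design, target, rcond=None)
    return weights[:-1], weights[-1]


def r2_score(target, pred):
    """Coefficient of determination pooled over all output dimensions."""
    ss_res = np.sum((target - pred) ** 2)
    ss_tot = np.sum((target - target.mean(axis=0)) ** 2)
    return 1.0 - ss_res / ss_tot


rng = np.random.default_rng(0)
truth = rng.normal(size=(500, 2))                 # stand-in for ground-truth (q, p)
mix = np.array([[2.0, 0.5], [-1.0, 1.5]])         # hypothetical affine gauge
clean_latent = truth @ mix + np.array([0.3, -0.7])
train, test = slice(0, 400), slice(400, 500)

scores = {}
for noise in (0.0, 0.5, 2.0):                     # assumed degradation levels
    latent = clean_latent + noise * rng.normal(size=clean_latent.shape)
    A, b = fit_affine(latent[train], truth[train])  # alignment fitted on train split only
    scores[noise] = r2_score(truth[test], latent[test] @ A + b)

for noise, score in scores.items():
    print(f"noise={noise:.1f}  held-out R2={score:.3f}")
```

In a real experiment the noisy synthetic latent would be replaced by teacher outputs of varying quality, and the same alignment and score would be applied to the student's rollouts.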

Figures

Figures reproduced from arXiv: 2605.16682 by Daniel Moyer, Kaiyuan Tan, Peilun Li, Thomas Beckers.

Figure 1: From partial observations, a windowed state encoder infers a latent teacher state …
Figure 2: Q1: Identification from partial observations on Duffing. We compare the ground-truth phase portrait with the teacher and direct one-stage CL+PHNN in both numeric and video settings. The rightmost panel reports observation-conditioned AUC-$R^2_{\mathrm{id},1:20}$ and the rollout-$R^2$ curve after a single train-split affine alignment to physical coordinates. On this harder dissipative system, the teacher consistently identi…
Figure 3: Q2: Projection preserves the identified teacher dynamics. Dynamics representations are shown for pendulum video and Duffing numeric, together with the projection score AUC-$R^2_{\mathrm{proj},1:20}$. The student stays close to the teacher's identified flow on both systems, showing that projection onto the port-Hamiltonian submanifold preserves most of the identified short-horizon dynamics.
Figure 4: Q3: Physics-informed projection through the learned Hamiltonian. The student's learned Hamiltonian is visualized after affine alignment to physical coordinates. Top row: on the conservative pendulum, the teacher trajectory violates physical principles by gaining energy over time, whereas the student remains on a near-constant energy level set. Bottom row: on dissipative Duffing, the student learns a descen…
Figure 5: This figure showcases the video frames we used for the video modality. The ground truth …
Figure 6: Visualization of our framework variants on pendulum dynamics identification (Q1). This figure shows the state representation learned from numeric data (top section) and from video (bottom section). The ground-truth phase diagram is shown on the left. Both the student and CL+pHNN models are physics-consistent, while the teacher model is not.
Figure 7: Visualization of our framework variants on Duffing dynamics identification (Q1). This figure shows the state representation learned from numeric data (top section) and from video (bottom section). The ground-truth phase diagram is shown on the left. Both the student and CL+pHNN models are physics-consistent, while the teacher model is not.
Figure 8: Visualization of DCL on pendulum and Duffing dynamics identification (Q1). This figure shows the state representation learned from numeric data (middle column) and from video (right column). The ground-truth phase diagram is shown on the left. Under partial observation, DCL does not learn useful representations of the dynamics. This is expected because partial observations (non-injective) are a violation of DCL's th…
Figure 9: Visualization of DCL* on pendulum and Duffing dynamics identification (Q1). DCL* is DCL with switching linear dynamics trained on fully observable data; the only applicable case is the numeric dataset for pendulum and Duffing. This figure shows the state representation learned from pendulum numeric data (first row, second column) and from Duffing numeric data (second row, second column). Ground truth phase diagr…
Figure 10: Visualization of Mamba-3 on pendulum and Duffing dynamics identification (Q1). This figure shows the state representation learned from numeric data (middle column) and from video (right column). The ground-truth phase diagram is shown on the left. Mamba-3 performs extremely well on numeric datasets because reconstruction of the position state is used in its loss objective. However, when state information is not explic…
Original abstract

Identifying latent state representations and dynamics is essential when direct modeling in observation space is infeasible, particularly under partial and high-dimensional observations. In such settings, representation learning and physics-aware modeling are inherently coupled. We study this problem for latent port-Hamiltonian systems, a structured class encompassing both conservative and dissipative dynamics. We propose a two-stage identify-then-project framework. First, a contrastive teacher learns continuous-time latent dynamics from partial observations. Then, a student projects the identified teacher representation and dynamics onto a port-Hamiltonian submanifold via a learned affine chart, yielding a physically consistent realization. As a conceptual counterfactual, we also consider a single-stage variant that jointly learns latent identification and port-Hamiltonian structure, but find it to be less reliable, motivating the proposed two-stage teacher-student framework. We show theoretically that affine projection is the natural bridge between the affine gauge of contrastive latent identification and the port-Hamiltonian systems. Empirically, we demonstrate that the proposed two-stage approach preserves the teacher's dynamics while enforcing physical structure, and performs more reliably than the single-stage alternative, particularly in dissipative regimes and high-dimensional visual settings.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript proposes a two-stage 'identify-then-project' framework for learning latent port-Hamiltonian dynamics from partial and high-dimensional observations. A contrastive teacher first identifies continuous-time latent dynamics; a student then applies a learned affine projection to map the teacher's representation onto a port-Hamiltonian submanifold. The paper claims a theoretical result that affine projection naturally bridges the affine gauge freedom of contrastive latent identification with port-Hamiltonian structure, and presents empirical evidence that the two-stage pipeline preserves teacher dynamics while enforcing physical consistency and outperforms a single-stage joint-learning baseline, especially in dissipative regimes and visual settings.

Significance. If the theoretical bridge and the empirical reliability claims hold, the work would provide a principled separation between representation learning and structure enforcement that could improve generalization and physical consistency in latent dynamical models. The explicit handling of affine gauge freedom and the focus on port-Hamiltonian systems (covering both conservative and dissipative cases) address a relevant gap between contrastive learning and physics-informed modeling.

major comments (2)
  1. [Section 3] Section 3 (theoretical bridge): the claim that affine projection is the 'natural bridge' between the contrastive affine gauge and port-Hamiltonian structure presupposes that the teacher's identified latent flow is already accurate; the derivation does not quantify how approximation errors in the teacher's continuous-time dynamics propagate through the affine chart or whether the projection can recover missing dynamical information.
  2. [Empirical evaluation] Empirical evaluation (dissipative and high-dimensional regimes): the reported superiority of the two-stage approach over the single-stage counterfactual does not include controlled ablations in which the contrastive teacher is deliberately degraded (e.g., via reduced observation quality or increased dissipation); without such tests it remains unclear whether the projection step can compensate for imperfect initial identification, which is the central assumption highlighted in the pipeline.
minor comments (2)
  1. [Abstract] The abstract and introduction could more explicitly define the precise port-Hamiltonian structure (e.g., which matrices or functions are learned versus fixed) to clarify what 'physical consistency' is being enforced.
  2. [Notation] Notation for the affine chart and the projection operator should be introduced with a single consolidated table or diagram to reduce cross-referencing between the theoretical and algorithmic sections.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive report. The comments raise important points about the assumptions in our theoretical analysis and the robustness of our empirical claims. We address each major comment below and indicate the revisions we will make.

Point-by-point responses
  1. Referee: [Section 3] Section 3 (theoretical bridge): the claim that affine projection is the 'natural bridge' between the contrastive affine gauge and port-Hamiltonian structure presupposes that the teacher's identified latent flow is already accurate; the derivation does not quantify how approximation errors in the teacher's continuous-time dynamics propagate through the affine chart or whether the projection can recover missing dynamical information.

    Authors: We agree that the theoretical derivation in Section 3 assumes an accurate latent flow from the teacher and establishes that affine projection respects the gauge freedom while embedding the dynamics into port-Hamiltonian form. The result does not claim that the projection recovers missing dynamical information from an imperfect teacher; it only shows structure enforcement on top of whatever representation the teacher provides. We do not quantify error propagation in the current analysis, which focuses on the exact case to highlight the conceptual bridge. In the revision we will add an explicit discussion of this assumption and its practical implications, including a brief remark on how teacher approximation errors would carry through the affine chart. revision: partial

  2. Referee: [Empirical evaluation] Empirical evaluation (dissipative and high-dimensional regimes): the reported superiority of the two-stage approach over the single-stage counterfactual does not include controlled ablations in which the contrastive teacher is deliberately degraded (e.g., via reduced observation quality or increased dissipation); without such tests it remains unclear whether the projection step can compensate for imperfect initial identification, which is the central assumption highlighted in the pipeline.

    Authors: We acknowledge that the current experiments do not include controlled degradations of the teacher, so they do not directly test the projection step's ability to compensate for poor initial identification. The reported results show that the two-stage pipeline preserves teacher dynamics while adding physical consistency and outperforms joint single-stage training, especially under dissipation. To address the referee's concern we will add new ablation experiments in the revision that deliberately degrade the teacher (e.g., by increasing observation noise or dissipation strength) and measure how well the subsequent projection recovers a consistent port-Hamiltonian model. These results will clarify the practical limits of the identify-then-project separation. revision: yes
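As a hedged illustration of what an error-propagation remark could look like (this is not the authors' analysis), suppose the affine chart is $x = A z + b$ with $A$ invertible, the true latent vector field is $f(z)$, and the teacher provides $\hat{f}(z) = f(z) + \varepsilon(z)$. All of these symbols are notation assumed for this sketch.

```latex
\begin{aligned}
\dot{x} &= A\, f(z), \qquad \dot{\hat{x}} = A\, \hat{f}(z) = A\, f(z) + A\, \varepsilon(z), \\
\bigl\| \dot{\hat{x}} - \dot{x} \bigr\|_2 &= \| A\, \varepsilon(z) \|_2
  \;\le\; \sigma_{\max}(A)\, \| \varepsilon(z) \|_2 .
\end{aligned}
```

Under these assumptions the chart rescales the teacher's vector-field error by at most the largest singular value of $A$ and cannot remove it. The bound is pointwise in the vector field; a bound on trajectories would need an additional argument over the integration horizon.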

Circularity Check

0 steps flagged

No circularity detected; derivation builds on independent contrastive learning and port-Hamiltonian foundations

Full rationale

The paper's central theoretical claim—that affine projection serves as the natural bridge between the affine gauge of contrastive latent identification and port-Hamiltonian structure—is presented as a derived result in Section 3 rather than a redefinition of inputs. The two-stage identify-then-project pipeline treats the contrastive teacher's output as an independent starting point whose dynamics are then mapped via a learned affine chart; this mapping is not shown to be equivalent to the teacher's fit by construction. No self-citations are load-bearing for the uniqueness or correctness of the bridge, no fitted parameters are relabeled as predictions, and no ansatz is smuggled through prior author work. The framework remains self-contained against external benchmarks from contrastive representation learning and structured dynamical systems, with the single-stage counterfactual serving as an empirical comparison rather than a definitional necessity.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claim rests on standard assumptions from contrastive representation learning and port-Hamiltonian systems theory; no new free parameters, axioms, or invented entities are explicitly introduced in the abstract beyond the proposed projection mechanism.

axioms (2)
  • domain assumption: Contrastive learning can identify continuous-time latent dynamics from partial observations.
    Invoked as the first stage of the teacher model in the abstract.
  • domain assumption: Affine charts provide a natural mapping to port-Hamiltonian submanifolds.
    Stated as the theoretical bridge in the abstract.

pith-pipeline@v0.9.0 · 5743 in / 1362 out tokens · 44009 ms · 2026-05-20T19:12:10.747553+00:00 · methodology


