
arxiv: 2605.18883 · v1 · pith:VGFZXNLPnew · submitted 2026-05-16 · 💻 cs.LG · cs.AI

Prediction Is Not Physics: Learning and Evaluating Conserved Quantities in Neural Simulators

Pith reviewed 2026-05-20 15:56 UTC · model grok-4.3

classification 💻 cs.LG cs.AI
keywords conserved quantities · neural simulators · Hamiltonian systems · energy conservation · trajectory prediction · diffusion models · conservation discovery network · temporal consistency

The pith

Neural networks can predict physical trajectories accurately without learning the true conserved energy unless given explicit alignment to analytical values.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper examines whether neural networks trained on trajectories from Hamiltonian systems can discover globally conserved quantities such as energy. It tests structured kinetic-plus-potential models, black-box Conservation Discovery Networks, polynomial variants, and diffusion baselines on projectile motion, pendulum, and spring-mass systems. The structured model recovers analytical energy with R squared at or above 0.9999 on clean data, while black-box networks reach R squared at or above 0.996 only when a temporal consistency loss is paired with a small alignment term to analytical energy at the initial time. Removing the alignment term drops the Pearson R squared against analytical energy below 0.001 on the pendulum and spring-mass cases, showing that temporal consistency alone does not reliably identify the true energy. Under 1 percent additive Gaussian noise the black-box approach outperforms the structured one on the projectile and spring-mass systems, though polynomial results vary strongly with training length and data volume.

Core claim

A diffusion model achieves rollout MSE near 10 to the minus 3 on Hamiltonian trajectories yet produces energy standard deviation 7500 to 36000 times larger than ground truth. This gap leads to the question of whether networks can learn or select conserved quantities. The structured T of v plus V of q model matches analytical energy to R squared greater than or equal to 0.9999 on clean data. The black-box CDN reaches R squared greater than or equal to 0.996 only with temporal consistency plus an alignment loss of lambda equal to 0.2 to analytical energy at t equals 0; with lambda equal to 0 the Pearson R squared collapses below 10 to the minus 3 on pendulum and spring-mass. Under 1 percent 1D

What carries the argument

The Conservation Discovery Network (CDN), a neural model that learns a scalar conserved quantity from position-velocity trajectories by optimizing a temporal consistency loss, optionally augmented by alignment to analytical energy at the first timestep.
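The objective described above can be sketched as follows. This is a minimal illustration, not the paper's Equation 1: it assumes the temporal consistency term is the per-trajectory variance of the model output over time and the alignment term is a mean squared error at the first timestep. The function names are hypothetical, and the variance-hinge regularizer mentioned in the Figure 1 caption is left out because its form is not given on this page.

```python
import numpy as np

def temporal_consistency_loss(f_vals):
    """Mean over trajectories of the variance of f(s_t) across time.

    f_vals has shape (n_trajectories, n_timesteps).
    Assumed form; the paper's exact loss may differ.
    """
    return float(np.mean(np.var(f_vals, axis=1)))

def alignment_loss(f_vals, energy_t0):
    """Squared error between f(s_0) and analytical energy at t = 0 (assumed form)."""
    return float(np.mean((f_vals[:, 0] - energy_t0) ** 2))

def cdn_loss(f_vals, energy_t0, lambda_align=0.2):
    """Consistency term plus a weighted alignment term."""
    consistency = temporal_consistency_loss(f_vals)
    return consistency + lambda_align * alignment_loss(f_vals, energy_t0)

# A constant output is perfectly "conserved" but carries no physical information.
constant_output = np.zeros((3, 5))
energy_t0 = np.array([1.0, 2.0, 3.0])

loss_without_alignment = cdn_loss(constant_output, energy_t0, lambda_align=0.0)
loss_with_alignment = cdn_loss(constant_output, energy_t0, lambda_align=0.2)
```

In this toy case the constant output has zero loss when `lambda_align` is 0 and a positive loss once the alignment term is switched on, which is one way to see why a consistency term on its own does not pin down the energy.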

If this is right

  • Structured kinetic-plus-potential models recover analytical energy to R squared above 0.9999 on clean Hamiltonian trajectories.
  • Black-box CDN performance collapses without the alignment term, indicating temporal consistency alone does not reliably identify the true conserved quantity.
  • Under 1 percent additive noise the black-box CDN can outperform the structured model on projectile and spring-mass systems.
  • The polynomial CDN reaches R squared of 0.9998 on the pendulum system given longer training and more data, with or without added noise, but only 0.78 under a short training schedule.
  • Low rollout mean squared error in neural simulators does not imply preservation of physical invariants such as energy.
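The last point in the list above can be illustrated with a toy spring-mass rollout. The sketch below is not the paper's diffusion model: it assumes unit mass and stiffness, uses a leapfrog integrator as the reference trajectory, and stands in for model error with independent Gaussian noise of standard deviation 0.03 on position and velocity. All names and parameter values are illustrative assumptions.

```python
import numpy as np

def leapfrog_spring(q0, v0, dt, n_steps, k=1.0, m=1.0):
    """Integrate a spring-mass system with the leapfrog (velocity Verlet) scheme."""
    q = np.empty(n_steps)
    v = np.empty(n_steps)
    q[0], v[0] = q0, v0
    for i in range(n_steps - 1):
        v_half = v[i] - 0.5 * dt * (k / m) * q[i]
        q[i + 1] = q[i] + dt * v_half
        v[i + 1] = v_half - 0.5 * dt * (k / m) * q[i + 1]
    return q, v

def spring_energy(q, v, k=1.0, m=1.0):
    """Analytical energy: kinetic plus elastic potential."""
    return 0.5 * m * v**2 + 0.5 * k * q**2

# Reference trajectory (assumed parameters).
q_ref, v_ref = leapfrog_spring(q0=1.0, v0=0.0, dt=0.01, n_steps=1000)

# Hypothetical "prediction": the reference plus small pointwise errors.
rng = np.random.default_rng(0)
q_pred = q_ref + rng.normal(0.0, 0.03, size=q_ref.shape)
v_pred = v_ref + rng.normal(0.0, 0.03, size=v_ref.shape)

# Rollout error is small (close to 1e-3 for this noise level) ...
rollout_mse = float(np.mean((q_pred - q_ref) ** 2 + (v_pred - v_ref) ** 2) / 2.0)

# ... while the energy of the prediction fluctuates far more than the reference.
energy_std_ratio = float(
    np.std(spring_energy(q_pred, v_pred)) / np.std(spring_energy(q_ref, v_ref))
)
```

The ratio depends on the integrator step and the noise level chosen here, so it should not be compared numerically with the paper's reported range. The point is only that a small state error can coexist with a large relative error in a conserved quantity.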

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Hybrid training that combines rollout prediction with conservation discovery could reduce long-term drift in learned simulators even when full analytical expressions are unavailable.
  • The observed sensitivity to alignment loss suggests that discovery methods may need adaptation when conserved quantities are unknown or when data contain unknown external influences.
  • Applying similar consistency-plus-alignment objectives to systems with multiple invariants or mild dissipation could clarify the boundaries of what black-box networks can extract from trajectories.
  • If conserved-quantity learning improves stability, it might serve as an auxiliary objective for training predictive models on real sensor data from physical experiments.

Load-bearing premise

The three chosen systems are exactly Hamiltonian with no hidden dissipation or external forces, so the analytical energy is the unique globally conserved quantity.

What would settle it

Retraining the black-box CDN with only the temporal consistency loss on larger datasets or new Hamiltonian systems and checking whether it recovers analytical energy with R squared above 0.99 would test whether alignment is required.

Figures

Figures reproduced from arXiv: 2605.18883 by Aditya Kothari, Andrew Bukowski, Ishir Rao, Simba Shi.

Figure 1: Model schematic. The black-box CDN is an MLP with four hidden Linear–SiLU blocks of hidden dimension 256, mapping s ∈ R^D to a scalar invariant f(s) ∈ R with no imposed physical structure. It is trained on min-max-normalized states using the temporal consistency loss and variance-hinge regularizer in Equation 1. We evaluate two variants: CDN-Conservation, which uses only the conservation objective, and CDN…

Abstract
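The architecture in the caption can be sketched as a plain NumPy forward pass. Only the layer structure (four hidden Linear–SiLU blocks of width 256, scalar output) and the min-max normalization come from the caption; the weight initialization, the function names, and the choice of D = 2 in the usage lines are assumptions made for illustration.

```python
import numpy as np

def silu(x):
    """SiLU activation: x * sigmoid(x)."""
    return x / (1.0 + np.exp(-x))

def min_max_normalize(states):
    """Scale each state dimension to [0, 1] over the dataset."""
    lo = states.min(axis=0)
    hi = states.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)  # guard against constant columns
    return (states - lo) / span

def init_cdn(state_dim, hidden=256, n_hidden=4, seed=0):
    """Create (weight, bias) pairs; the initialization scheme is an assumption."""
    rng = np.random.default_rng(seed)
    sizes = [state_dim] + [hidden] * n_hidden + [1]
    params = []
    for fan_in, fan_out in zip(sizes[:-1], sizes[1:]):
        w = rng.normal(0.0, np.sqrt(1.0 / fan_in), size=(fan_in, fan_out))
        params.append((w, np.zeros(fan_out)))
    return params

def cdn_forward(params, states):
    """Map states of shape (n, D) to one scalar f(s) per state."""
    h = states
    for w, b in params[:-1]:
        h = silu(h @ w + b)
    w, b = params[-1]
    return (h @ w + b)[:, 0]

# Usage with D = 2, for example (position, velocity).
raw_states = np.random.default_rng(1).normal(size=(10, 2))
normalized = min_max_normalize(raw_states)
params = init_cdn(state_dim=2)
outputs = cdn_forward(params, normalized)
n_parameters = sum(w.size + b.size for w, b in params)
```

Nothing in this network encodes a kinetic or potential term, which matches the caption's description of a model with no imposed physical structure.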

A diffusion model trained on Hamiltonian trajectories can achieve rollout MSE near $10^{-3}$, but the standard deviation of its energy over time is between 7500 and 36000 times larger than the ground-truth energy standard deviation, indicating a failure to preserve conservation laws. This gap motivates our central question of whether neural networks can learn or select globally conserved quantities from physical trajectories. We investigate this across three Hamiltonian systems: projectile motion, pendulum, and spring-mass. We use a structured $T(v)+V(q)$ energy model, a black-box Conservation Discovery Network (CDN), a polynomial CDN, and a conditional diffusion baseline. The structured network reaches $R^2 \geq 0.9999$ against analytical energy on clean data, while the black-box CDN reaches $R^2 \geq 0.996$ when trained with temporal consistency plus a small alignment loss to analytical energy at $t=0$ ($\lambda_{\mathrm{align}}=0.2$). With $\lambda_{\mathrm{align}}=0$, CDN Pearson $R^2$ collapses on pendulum and spring-mass ($< 10^{-3}$), showing that temporal consistency alone is not enough to reliably identify the true energy. Under $1\%$ additive Gaussian noise, the CDN outperforms the structured model on the projectile and spring-mass systems, suggesting that the CDN may be more robust to noisy inputs in this setting. However, the polynomial CDN is sensitive to training configuration: it achieves $R^2=0.78$ under a short training schedule on the pendulum system, but reaches $R^2=0.9998$ with more training time and data, regardless of whether noise is added.
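For reference, the standard textbook energies of the three systems, each of the form $T(v)+V(q)$, are written out below. The symbols $m$, $g$, $\ell$, and $k$, the coordinate choices, and the zero of potential energy are assumptions here; the page does not state the paper's own parameterization.

```latex
\begin{aligned}
E_{\text{projectile}} &= \tfrac{1}{2} m \left(v_x^2 + v_y^2\right) + m g y, \\
E_{\text{pendulum}}   &= \tfrac{1}{2} m \ell^2 \dot{\theta}^2 + m g \ell \left(1 - \cos\theta\right), \\
E_{\text{spring}}     &= \tfrac{1}{2} m v^2 + \tfrac{1}{2} k q^2 .
\end{aligned}
```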

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper shows that diffusion models trained on Hamiltonian trajectories achieve rollout MSE near 10^{-3} but produce energy trajectories whose standard deviation is 7500–36000 times larger than the ground-truth energy standard deviation. It then compares four approaches—structured T(v)+V(q) energy model, black-box Conservation Discovery Network (CDN), polynomial CDN, and conditional diffusion baseline—on projectile motion, pendulum, and spring-mass systems. The structured model reaches R² ≥ 0.9999 against analytical energy; the black-box CDN reaches R² ≥ 0.996 only when a small alignment loss to analytical energy at t=0 (λ_align=0.2) is added to temporal consistency, collapsing to R² < 10^{-3} without it; under 1% additive Gaussian noise the CDN outperforms the structured model on two systems.

Significance. If the central empirical findings hold, the work supplies concrete evidence that low prediction error in neural simulators does not imply preservation of conserved quantities and that purely unsupervised temporal-consistency objectives are insufficient to recover the physically relevant invariant on these systems. The explicit comparison of structured, black-box, and polynomial architectures, together with the noise-robustness results, offers a useful benchmark for future work on physically consistent neural simulators.

major comments (3)
  1. [Abstract and results on projectile motion] Abstract and §4 (results on projectile motion): the claim that temporal consistency alone fails to identify the true energy rests on the assumption that analytical energy is the unique globally conserved scalar recoverable from trajectories. Projectile motion also conserves horizontal momentum; a black-box CDN minimizing per-trajectory variance could recover a momentum-like function, producing low Pearson R² specifically with energy. The manuscript should report the variance of the learned CDN output and its correlation with both energy and momentum on this system.
  2. [Abstract and methods] Abstract and methods (loss formulations): the positive CDN result (R² ≥ 0.996) is obtained only with λ_align=0.2; with λ_align=0 the method collapses on pendulum and spring-mass. Because the successful regime therefore depends on supervision from the analytical energy at t=0, the manuscript should clarify whether the central claim is that conserved quantities can be discovered from trajectories alone or that a modest amount of analytical supervision is required.
  3. [Abstract] Abstract (noise-robustness paragraph): the statement that CDN outperforms the structured model under 1% additive Gaussian noise on projectile and spring-mass lacks the exact loss formulations, training schedules, and statistical significance tests used for that comparison. Given that the polynomial CDN is shown to be sensitive to training configuration, these details are load-bearing for the robustness claim.
minor comments (2)
  1. [Abstract] The precise numerical ranges for energy standard-deviation ratios (7500–36000) should be broken down by system rather than reported as a single interval.
  2. [Methods] Notation for the alignment loss term and the exact value chosen for λ_align should be defined in the main text before the results are presented.
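The diagnostic requested in major comment 1 can be sketched as follows. This is an illustration under stated assumptions, not the authors' analysis: trajectories come from the closed-form drag-free projectile solution with unit mass and g = 9.81, and the "learned" output is a hypothetical stand-in that is an affine function of horizontal momentum rather than the output of a trained CDN.

```python
import numpy as np

G = 9.81  # gravitational acceleration (assumed value)
M = 1.0   # mass (assumed value)

def pearson_r2(a, b):
    """Squared Pearson correlation between two arrays, flattened."""
    r = np.corrcoef(np.ravel(a), np.ravel(b))[0, 1]
    return float(r**2)

def projectile_batch(n_traj=64, n_steps=50, dt=0.02, seed=0):
    """Closed-form projectile trajectories with random initial velocities."""
    rng = np.random.default_rng(seed)
    vx0 = rng.uniform(1.0, 5.0, size=(n_traj, 1))
    vy0 = rng.uniform(1.0, 5.0, size=(n_traj, 1))
    t = dt * np.arange(n_steps)[None, :]
    y = vy0 * t - 0.5 * G * t**2
    vx = vx0 * np.ones_like(t)
    vy = vy0 - G * t
    return y, vx, vy

y, vx, vy = projectile_batch()
energy = 0.5 * M * (vx**2 + vy**2) + M * G * y
momentum_x = M * vx

# Hypothetical stand-in for a CDN that latched onto horizontal momentum.
learned = 3.0 * momentum_x - 1.0

temporal_variance = float(np.mean(np.var(learned, axis=1)))
r2_energy = pearson_r2(learned, energy)
r2_momentum = pearson_r2(learned, momentum_x)
```

In this constructed case the stand-in output is constant along every trajectory and correlates perfectly with horizontal momentum while correlating only partially with energy. Reporting all three numbers for the trained CDN would distinguish a genuine failure to conserve from recovery of a different invariant.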

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive comments, which help clarify the scope of our claims about unsupervised conservation discovery and the robustness of our empirical comparisons. We address each major point below, proposing targeted revisions to the manuscript where the suggestions strengthen the presentation without altering the core findings.

Point-by-point responses
  1. Referee: [Abstract and results on projectile motion] Abstract and §4 (results on projectile motion): the claim that temporal consistency alone fails to identify the true energy rests on the assumption that analytical energy is the unique globally conserved scalar recoverable from trajectories. Projectile motion also conserves horizontal momentum; a black-box CDN minimizing per-trajectory variance could recover a momentum-like function, producing low Pearson R² specifically with energy. The manuscript should report the variance of the learned CDN output and its correlation with both energy and momentum on this system.

    Authors: We agree that projectile motion conserves horizontal momentum in addition to total energy, and that a variance-minimizing black-box CDN could in principle recover a momentum-like scalar rather than energy. This is a valid point that qualifies our interpretation of the low R² values on this system. We will add the requested analysis: for the projectile-motion experiments we will report the temporal variance of the learned CDN output together with its Pearson correlations against both the analytical energy and the horizontal momentum. These results will be included in a revised §4 (and referenced from the abstract if they materially affect the summary claims). revision: yes

  2. Referee: [Abstract and methods] Abstract and methods (loss formulations): the positive CDN result (R² ≥ 0.996) is obtained only with λ_align=0.2; with λ_align=0 the method collapses on pendulum and spring-mass. Because the successful regime therefore depends on supervision from the analytical energy at t=0, the manuscript should clarify whether the central claim is that conserved quantities can be discovered from trajectories alone or that a modest amount of analytical supervision is required.

    Authors: We accept the observation. The manuscript already states that R² collapses to < 10^{-3} when λ_align=0 on pendulum and spring-mass, but the abstract and methods sections do not sufficiently foreground that the reported high-R² regime uses a small supervised alignment term. In the revision we will (i) explicitly label the λ_align=0.2 setting as “temporal consistency plus modest alignment supervision” throughout the abstract and §3, and (ii) rephrase the central claim to emphasize that purely unsupervised temporal consistency is insufficient to recover the physically relevant invariant on these systems, while a modest amount of supervision at a single time point enables recovery. This does not change the empirical result but makes the scope of the claim precise. revision: yes

  3. Referee: [Abstract] Abstract (noise-robustness paragraph): the statement that CDN outperforms the structured model under 1% additive Gaussian noise on projectile and spring-mass lacks the exact loss formulations, training schedules, and statistical significance tests used for that comparison. Given that the polynomial CDN is shown to be sensitive to training configuration, these details are load-bearing for the robustness claim.

    Authors: We agree that the noise-robustness paragraph in the abstract is too terse. In the revised manuscript we will expand the corresponding paragraph in §4 (and the abstract summary) to include: the precise loss weights used for both models under noise, the training schedule and data volume, the optimizer settings, and the number of random seeds together with any statistical significance tests performed. We already note the sensitivity of the polynomial CDN to training configuration; the added details will make the CDN-versus-structured comparison reproducible and will qualify the robustness statement accordingly. revision: yes

Circularity Check

0 steps flagged

No significant circularity in empirical evaluation chain


The paper reports direct experimental comparisons of learned functions against independently known analytical energies for three Hamiltonian systems, using explicit loss terms (temporal consistency and optional alignment) whose effects are ablated and quantified via R². No derivation reduces a claimed prediction to a fitted input by construction, no uniqueness theorem is imported from self-citation, and the central observation—that λ_align=0 yields collapse in correlation—is a measured outcome rather than a definitional equivalence. The evaluation benchmark (analytical energy) is external to the training procedure when λ_align=0, rendering the reported failure of pure temporal consistency a self-contained empirical result.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 1 invented entity

The paper relies on the assumption that the tested systems are purely Hamiltonian and that the analytical energy is the target conserved quantity; it introduces the CDN architecture and the alignment loss weight as a tunable parameter.

free parameters (1)
  • lambda_align = 0.2
    Weight of the alignment loss to analytical energy at t=0; set to 0.2 for the successful CDN runs.
axioms (1)
  • domain assumption: The projectile, pendulum, and spring-mass systems obey Hamiltonian dynamics with a single globally conserved energy function.
    Invoked when treating analytical energy as ground truth for alignment and R² evaluation.
invented entities (1)
  • Conservation Discovery Network (CDN) · no independent evidence
    purpose: Black-box network that outputs a scalar conserved quantity from state trajectories.
    New model architecture introduced and compared to structured and polynomial variants.



