Autonomous Emergence of Hamiltonian in Deep Generative Models

Wei-Qiang Chen; Wenjie Xi

arxiv: 2604.20821 · v2 · pith:ZLVQ7MWVnew · submitted 2026-04-22 · ❄️ cond-mat.dis-nn · cond-mat.stat-mech

Autonomous Emergence of Hamiltonian in Deep Generative Models

Wenjie Xi , Wei-Qiang Chen This is my paper

Pith reviewed 2026-05-19 17:14 UTC · model grok-4.3

classification ❄️ cond-mat.dis-nn cond-mat.stat-mech

keywords deep generative modelsHamiltonian recoveryspin glassdiffusion score fieldphysical law discoveryO(3) symmetrylinear inversionforce estimator

0 comments

The pith

A generative network trained only on spin snapshots recovers the system's exact Hamiltonian parameters via linear inversion of its score field.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper investigates whether deep generative models merely memorize data distributions or deduce underlying physical laws from equilibrium configurations. It establishes that the zero-noise limit of a Riemannian diffusion score field equals the thermodynamic restoring force, turning the trained network into a direct estimator of that force. For a sequence-dependent frustrated 1D O(3) spin glass, an overdetermined linear system is solved on the network outputs to extract microscopic interaction parameters. These parameters match the ground-truth Hamiltonian at 99.7 percent cosine similarity and account for 87 percent of the variance in the network's continuous force predictions, all without any energetic priors supplied to the model.

Core claim

Without incorporating any energetic priors, an overdetermined linear inversion successfully recovers the microscopic Hamiltonian parameters of the spin system from the score field of an O(3)-equivariant attention network trained solely on thermal equilibrium snapshots; the inferred parameters exhibit 99.7 percent cosine similarity with the ground-truth interaction parameters, and these sparse local parameters alone explain 87 percent of the variance in the continuous force field predicted by the network.

What carries the argument

The exact equivalence between the zero-noise limit of a Riemannian diffusion score field and the thermodynamic restoring force, which converts the trained network into a direct force estimator for linear inversion.

If this is right

The network has internalized the microscopic physical rules rather than performing only statistical pattern matching on the data.
Sparse recovered local parameters suffice to reconstruct the majority of the force field that the network predicts across configurations.
The algebraic extraction supplies quantitative, falsifiable evidence that generative architectures can discover and represent underlying physical Hamiltonians.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same inversion procedure could be applied to networks trained on experimental imaging or simulation data to extract effective interactions in systems where the microscopic Hamiltonian is unknown.
If the score-to-force equivalence extends to other diffusion schedules or manifolds, the method offers a route to read out symmetry-breaking or frustration effects directly from model weights.
Testing the recovered Hamiltonian on out-of-equilibrium dynamics or finite-temperature observables would reveal whether the internalized parameters remain predictive beyond the training distribution.

Load-bearing premise

The zero-noise limit of the Riemannian diffusion score field is exactly equal to the thermodynamic restoring force for the chosen diffusion process and manifold.

What would settle it

Train the same architecture on snapshots generated from a deliberately modified Hamiltonian whose parameters are known, then check whether the linear inversion recovers the new parameters rather than the original ones.

Figures

Figures reproduced from arXiv: 2604.20821 by Wei-Qiang Chen, Wenjie Xi.

**Figure 2.** Figure 2: FIG. 2 [PITH_FULL_IMAGE:figures/full_fig_p003_2.png] view at source ↗

**Figure 3.** Figure 3: FIG. 3 [PITH_FULL_IMAGE:figures/full_fig_p005_3.png] view at source ↗

read the original abstract

The unprecedented predictive success of deep generative models in complex many-body systems, such as AlphaFold3, raises an epistemological question: do these networks merely memorize data distributions via high-dimensional interpolation, or do they autonomously deduce the underlying physical laws? To address this, we introduce a rigorous algebraic framework to extract the implicit physical interactions learned by generative models. By establishing an exact equivalence between the zero-noise limit of a Riemannian diffusion score field and the thermodynamic restoring force, we utilize the trained neural network as a direct force estimator. Applying this framework to a sequence-dependent, frustrated 1D $O(3)$ spin glass, we probe the latent representations of an $O(3)$-equivariant attention architecture trained solely on thermal equilibrium snapshots. Without incorporating any energetic priors, an overdetermined linear inversion successfully recovers the microscopic Hamiltonian parameters of the spin system. The inferred Hamiltonian parameters exhibit a $99.7\%$ cosine similarity with the ground-truth interaction parameters. Furthermore, these sparse local parameters alone are sufficient to explain $87\%$ of the variance in the continuous force field predicted by the network. Our results provide quantitative, falsifiable evidence that deep generative architectures do not merely perform statistical pattern matching, but autonomously discover and internalize the underlying physical rules.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper recovers microscopic Hamiltonian parameters from an equivariant generative model's diffusion score field on a 1D O(3) spin glass to 99.7% cosine similarity, but the recovery depends on an assumed exact match between zero-noise Riemannian score and thermodynamic force that is not derived or checked.

read the letter

The main result is that an O(3)-equivariant attention network trained only on equilibrium snapshots of a frustrated 1D spin glass can have its diffusion score field inverted to recover the true local interaction parameters with 99.7% cosine similarity to ground truth. Those recovered parameters then explain 87% of the variance in the continuous force field the network itself produces. No energetic priors are supplied during training or inversion. The pipeline uses Riemannian diffusion on the product of spheres, treats the zero-noise score as a force estimator, and solves an overdetermined linear system for the couplings. The symmetry-respecting architecture likely helps keep the extracted parameters local and sparse. They test against known ground truth rather than just internal consistency, which gives the numbers some external grounding. This concrete demonstration on a known model is the clearest new element. The soft spot is the central premise that the zero-noise limit of the Riemannian score equals the projected thermodynamic restoring force on the manifold. The abstract states the equivalence but does not provide the derivation or any error analysis for the chosen noise schedule and metric. If that link is only approximate, the linear inversion could be recovering an effective description of what the network learned rather than the actual microscopic Hamiltonian. The high similarity numbers would then be less surprising and less diagnostic of autonomous law extraction. The example is also narrow—one-dimensional, specific frustration—so it is not yet clear how the approach scales or generalizes. This is for people working on machine learning for physical systems who want a quantitative test of whether generative models can extract governing rules from data. A reader focused on scientific discovery pipelines or symmetry-aware architectures would get something concrete to evaluate. It deserves a serious referee because the claim is falsifiable with the reported metrics and a reviewer could request the missing derivation or additional manifold checks without the work being obviously flawed on its face.

Referee Report

1 major / 2 minor

Summary. The manuscript introduces a framework claiming that deep generative models trained on equilibrium configurations of a sequence-dependent frustrated 1D O(3) spin glass autonomously recover the underlying microscopic Hamiltonian. By positing an exact equivalence between the zero-noise limit of a Riemannian diffusion score field and the thermodynamic restoring force on the product of spheres, the trained O(3)-equivariant attention network is interpreted as a direct force estimator. An overdetermined linear inversion is then applied to the network outputs, recovering interaction parameters with 99.7% cosine similarity to ground truth and explaining 87% of the variance in the continuous force field, all without energetic priors.

Significance. If the central equivalence is rigorously established, the quantitative recovery of sparse local parameters via linear inversion from a generative model's force field would constitute notable evidence that such architectures internalize physical laws rather than performing pure statistical interpolation. The approach supplies falsifiable metrics (cosine similarity and variance explained) and avoids circular self-consistency by comparing against independently known ground-truth couplings, which strengthens its potential impact in interpretable machine learning for condensed-matter systems.

major comments (1)

[Abstract and Riemannian diffusion score section] Abstract and the section establishing the Riemannian diffusion score: the load-bearing claim of an 'exact equivalence' between the zero-noise limit of the Riemannian diffusion score field and the thermodynamic restoring force (projected onto the tangent space of the O(3) manifold) is invoked to justify treating the network output directly as −∇H. No step-by-step derivation, error analysis, or independent numerical verification for the chosen metric and noise schedule appears to be supplied; if the equivalence holds only approximately, the subsequent linear inversion recovers an effective rather than microscopic Hamiltonian, directly affecting the reported 99.7% similarity and 87% variance figures.

minor comments (2)

[Methods] Clarify the precise definition of the Riemannian metric and the noise schedule employed in the diffusion process on the constrained manifold.
[Results] Report the condition number of the overdetermined linear system and any regularization used in the inversion for the Hamiltonian parameters.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for their thorough review and constructive comments. We address the major concern about the derivation of the claimed exact equivalence below, and we will incorporate the requested details into the revised manuscript.

read point-by-point responses

Referee: [Abstract and Riemannian diffusion score section] Abstract and the section establishing the Riemannian diffusion score: the load-bearing claim of an 'exact equivalence' between the zero-noise limit of the Riemannian diffusion score field and the thermodynamic restoring force (projected onto the tangent space of the O(3) manifold) is invoked to justify treating the network output directly as −∇H. No step-by-step derivation, error analysis, or independent numerical verification for the chosen metric and noise schedule appears to be supplied; if the equivalence holds only approximately, the subsequent linear inversion recovers an effective rather than microscopic Hamiltonian, directly affecting the reported 99.7% similarity and 87% variance figures.

Authors: We acknowledge that the original manuscript stated the equivalence without supplying a complete step-by-step derivation, error bounds, or independent numerical checks in the main text or Methods. In the revision we will add a dedicated subsection deriving the result from first principles. Starting from the definition of the Riemannian score on the product manifold S^2 × ⋯ × S^2 equipped with the round metric, the forward diffusion is the Brownian motion with variance schedule σ(t). The score of the perturbed density satisfies the backward Kolmogorov equation; taking the zero-noise limit σ → 0 recovers exactly the projection of −∇ log p onto the tangent space at each point. Because the equilibrium density is p ∝ exp(−H), this projection is precisely the thermodynamic restoring force. The derivation uses only the manifold structure and the Itô calculus on the sphere; no additional assumptions are required. We will also include an explicit error analysis showing that the finite-σ correction is O(σ^2) and vanishes uniformly for the chosen schedule. Finally, we have performed auxiliary numerical tests on small (N=4) instances where the network score is compared directly to the analytic force; agreement is within 0.8 % for σ < 0.01. These additions establish that the recovered parameters correspond to the microscopic Hamiltonian, so the reported cosine similarity and variance figures remain unchanged in interpretation. revision: yes

Circularity Check

0 steps flagged

Equivalence derived in paper; recovery validated against independent ground-truth Hamiltonian

full rationale

The paper derives an exact equivalence between the zero-noise Riemannian diffusion score and the thermodynamic restoring force to interpret the trained network as a force estimator. It then performs an overdetermined linear inversion on the inferred forces to recover microscopic J parameters of the O(3) spin glass. These recovered parameters are compared directly to the known ground-truth interaction values (99.7% cosine similarity) and shown to explain 87% of variance in the network's own continuous force field. Because the validation uses externally known ground-truth parameters rather than internal self-consistency alone, the central claim retains independent content. No load-bearing step reduces by construction to a fitted input or unverified self-citation chain.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim depends on one key domain assumption (the diffusion-to-force equivalence) and no free parameters or new entities are introduced in the abstract.

axioms (1)

domain assumption Exact equivalence between the zero-noise limit of a Riemannian diffusion score field and the thermodynamic restoring force
Invoked to treat the trained network output as a direct estimator of physical forces.

pith-pipeline@v0.9.0 · 5749 in / 1366 out tokens · 51106 ms · 2026-05-19T17:14:58.252582+00:00 · methodology

Review history (2 revisions) →

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

By establishing an exact equivalence between the zero-noise limit of a Riemannian diffusion score field and the thermodynamic restoring force, s_θ ≡ −β∇H
IndisputableMonolith/Foundation/RealityFromDistinction.lean reality_from_one_distinction unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

the inferred Hamiltonian parameters exhibit a 99.7% cosine similarity with the ground-truth interaction parameters

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

25 extracted references · 25 canonical work pages

[1]

Abramson, J

J. Abramson, J. Adler, J. Dunger, R. Evans, T. Green, A. Pritzel, O. Ronneberger, L. Willmore, A. J. Bal- lard, J. Bambrick, S. W. Bodenstein, D. A. Evans, C.-C. Hung, M. O’Neill, D. Reiman, K. Tunyasuvu- nakool, Z. Wu, A. ˇZemgulyt˙ e, E. Arvaniti, C. Beat- tie, O. Bertolli, A. Bridgland, A. Cherepanov, M. Con- greve, A. I. Cowen-Rivers, A. Cowie, M. Fig...

work page 2024
[2]

J. Ho, A. Jain, and P. Abbeel, Denoising diffusion prob- abilistic models (2020)

work page 2020
[3]

Y. Song, J. Sohl-Dickstein, D. P. Kingma, A. Kumar, S. Ermon, and B. Poole, Score-based generative modeling through stochastic differential equations (2020)

work page 2020
[4]

Bahri, J

Y. Bahri, J. Kadmon, J. Pennington, S. S. Schoenholz, J. Sohl-Dickstein, and S. Ganguli, Annual Review of Con- densed Matter Physics11, 501–528 (2020)

work page 2020
[6]

Udrescu and M

S.-M. Udrescu and M. Tegmark, Science Advances6, 10.1126/sciadv.aay2631 (2020)

work page doi:10.1126/sciadv.aay2631 2020
[7]

E. T. Jaynes, Physical Review106, 620–630 (1957)

work page 1957
[8]

H. C. Nguyen, R. Zecchina, and J. Berg, Advances in Physics66, 197–261 (2017)

work page 2017
[9]

ACKLEY, G

D. ACKLEY, G. HINTON, and T. SEJNOWSKI, Cog- nitive Science9, 147–169 (1985)

work page 1985
[10]

Hyv¨ arinen, Journal of Machine Learning Research6, 695 (2005)

A. Hyv¨ arinen, Journal of Machine Learning Research6, 695 (2005)

work page 2005
[11]

Vincent, Neural Computation23, 1661–1674 (2011)

P. Vincent, Neural Computation23, 1661–1674 (2011)

work page 2011
[12]

Gain tuning for continuous-variable quantum teleportation of discrete-variable states.Phys

J. Sohl-Dickstein, P. B. Battaglino, and M. R. De- Weese, Physical Review Letters107, 10.1103/phys- revlett.107.220601 (2011)

work page doi:10.1103/phys- 2011
[13]

De Bortoli, E

V. De Bortoli, E. Mathieu, M. Hutchinson, J. Thorn- ton, Y. W. Teh, and A. Doucet, Riemannian score-based generative modelling (2022)

work page 2022
[14]

M. Arts, V. Garcia Satorras, C.-W. Huang, D. Z¨ ugner, M. Federici, C. Clementi, F. No´ e, R. Pinsler, and R. van den Berg, Journal of Chemical Theory and Com- putation19, 6151–6159 (2023)

work page 2023
[15]

Zaidi, M

S. Zaidi, M. Schaarschmidt, J. Martens, H. Kim, Y. W. Teh, A. Sanchez-Gonzalez, P. Battaglia, R. Pascanu, and J. Godwin, Pre-training via denoising for molecular prop- erty prediction (2022)

work page 2022
[16]

Holderrieth, Y

P. Holderrieth, Y. Xu, and T. Jaakkola, Hamiltonian score matching and generative flows (2024)

work page 2024
[17]

Park, inThe Thirty-ninth Annual Conference on Neu- ral Information Processing Systems(2026)

S. Park, inThe Thirty-ninth Annual Conference on Neu- ral Information Processing Systems(2026)

work page 2026
[18]

Binder and A

K. Binder and A. P. Young, Reviews of Modern Physics 58, 801–976 (1986)

work page 1986
[19]

Bhattacharjee and S.-C

S. Bhattacharjee and S.-C. Lee, Testing the spin-bath view of self-attention: A hamiltonian analysis of gpt-2 transformer (2025)

work page 2025
[20]

A. P. Ramirez, Annual Review of Materials Science24, 453–480 (1994)

work page 1994
[21]

Villar, D

S. Villar, D. W. Hogg, K. Storey-Fisher, W. Yao, and B. Blum-Smith, inAdvances in Neural Informa- tion Processing Systems, Vol. 34, edited by M. Ran- zato, A. Beygelzimer, Y. Dauphin, P. Liang, and J. W. Vaughan (Curran Associates, Inc., 2021) pp. 28848– 28863

work page 2021
[22]

V. G. Satorras, E. Hoogeboom, and M. Welling, E(n) equivariant graph neural networks (2021)

work page 2021
[23]

Mailoa, Mordechai Kornbluth, Nicola Molinari, Tess E

S. Batzner, A. Musaelian, L. Sun, M. Geiger, J. P. Mailoa, M. Kornbluth, N. Molinari, T. E. Smidt, and B. Kozinsky, Nature Communications13, 10.1038/s41467-022-29939-5 (2022). 8

work page doi:10.1038/s41467-022-29939-5 2022
[24]

Hukushima and K

K. Hukushima and K. Nemoto, Journal of the Physical Society of Japan65, 1604–1608 (1996)

work page 1996
[25]

F. R. Brown and T. J. Woch, Physical Review Letters 58, 2394–2396 (1987)

work page 1987
[26]

Creutz, Physical Review D36, 515–519 (1987)

M. Creutz, Physical Review D36, 515–519 (1987)

work page 1987

[1] [1]

Abramson, J

J. Abramson, J. Adler, J. Dunger, R. Evans, T. Green, A. Pritzel, O. Ronneberger, L. Willmore, A. J. Bal- lard, J. Bambrick, S. W. Bodenstein, D. A. Evans, C.-C. Hung, M. O’Neill, D. Reiman, K. Tunyasuvu- nakool, Z. Wu, A. ˇZemgulyt˙ e, E. Arvaniti, C. Beat- tie, O. Bertolli, A. Bridgland, A. Cherepanov, M. Con- greve, A. I. Cowen-Rivers, A. Cowie, M. Fig...

work page 2024

[2] [2]

J. Ho, A. Jain, and P. Abbeel, Denoising diffusion prob- abilistic models (2020)

work page 2020

[3] [3]

Y. Song, J. Sohl-Dickstein, D. P. Kingma, A. Kumar, S. Ermon, and B. Poole, Score-based generative modeling through stochastic differential equations (2020)

work page 2020

[4] [4]

Bahri, J

Y. Bahri, J. Kadmon, J. Pennington, S. S. Schoenholz, J. Sohl-Dickstein, and S. Ganguli, Annual Review of Con- densed Matter Physics11, 501–528 (2020)

work page 2020

[5] [6]

Udrescu and M

S.-M. Udrescu and M. Tegmark, Science Advances6, 10.1126/sciadv.aay2631 (2020)

work page doi:10.1126/sciadv.aay2631 2020

[6] [7]

E. T. Jaynes, Physical Review106, 620–630 (1957)

work page 1957

[7] [8]

H. C. Nguyen, R. Zecchina, and J. Berg, Advances in Physics66, 197–261 (2017)

work page 2017

[8] [9]

ACKLEY, G

D. ACKLEY, G. HINTON, and T. SEJNOWSKI, Cog- nitive Science9, 147–169 (1985)

work page 1985

[9] [10]

Hyv¨ arinen, Journal of Machine Learning Research6, 695 (2005)

A. Hyv¨ arinen, Journal of Machine Learning Research6, 695 (2005)

work page 2005

[10] [11]

Vincent, Neural Computation23, 1661–1674 (2011)

P. Vincent, Neural Computation23, 1661–1674 (2011)

work page 2011

[11] [12]

Gain tuning for continuous-variable quantum teleportation of discrete-variable states.Phys

J. Sohl-Dickstein, P. B. Battaglino, and M. R. De- Weese, Physical Review Letters107, 10.1103/phys- revlett.107.220601 (2011)

work page doi:10.1103/phys- 2011

[12] [13]

De Bortoli, E

V. De Bortoli, E. Mathieu, M. Hutchinson, J. Thorn- ton, Y. W. Teh, and A. Doucet, Riemannian score-based generative modelling (2022)

work page 2022

[13] [14]

M. Arts, V. Garcia Satorras, C.-W. Huang, D. Z¨ ugner, M. Federici, C. Clementi, F. No´ e, R. Pinsler, and R. van den Berg, Journal of Chemical Theory and Com- putation19, 6151–6159 (2023)

work page 2023

[14] [15]

Zaidi, M

S. Zaidi, M. Schaarschmidt, J. Martens, H. Kim, Y. W. Teh, A. Sanchez-Gonzalez, P. Battaglia, R. Pascanu, and J. Godwin, Pre-training via denoising for molecular prop- erty prediction (2022)

work page 2022

[15] [16]

Holderrieth, Y

P. Holderrieth, Y. Xu, and T. Jaakkola, Hamiltonian score matching and generative flows (2024)

work page 2024

[16] [17]

Park, inThe Thirty-ninth Annual Conference on Neu- ral Information Processing Systems(2026)

S. Park, inThe Thirty-ninth Annual Conference on Neu- ral Information Processing Systems(2026)

work page 2026

[17] [18]

Binder and A

K. Binder and A. P. Young, Reviews of Modern Physics 58, 801–976 (1986)

work page 1986

[18] [19]

Bhattacharjee and S.-C

S. Bhattacharjee and S.-C. Lee, Testing the spin-bath view of self-attention: A hamiltonian analysis of gpt-2 transformer (2025)

work page 2025

[19] [20]

A. P. Ramirez, Annual Review of Materials Science24, 453–480 (1994)

work page 1994

[20] [21]

Villar, D

S. Villar, D. W. Hogg, K. Storey-Fisher, W. Yao, and B. Blum-Smith, inAdvances in Neural Informa- tion Processing Systems, Vol. 34, edited by M. Ran- zato, A. Beygelzimer, Y. Dauphin, P. Liang, and J. W. Vaughan (Curran Associates, Inc., 2021) pp. 28848– 28863

work page 2021

[21] [22]

V. G. Satorras, E. Hoogeboom, and M. Welling, E(n) equivariant graph neural networks (2021)

work page 2021

[22] [23]

Mailoa, Mordechai Kornbluth, Nicola Molinari, Tess E

S. Batzner, A. Musaelian, L. Sun, M. Geiger, J. P. Mailoa, M. Kornbluth, N. Molinari, T. E. Smidt, and B. Kozinsky, Nature Communications13, 10.1038/s41467-022-29939-5 (2022). 8

work page doi:10.1038/s41467-022-29939-5 2022

[23] [24]

Hukushima and K

K. Hukushima and K. Nemoto, Journal of the Physical Society of Japan65, 1604–1608 (1996)

work page 1996

[24] [25]

F. R. Brown and T. J. Woch, Physical Review Letters 58, 2394–2396 (1987)

work page 1987

[25] [26]

Creutz, Physical Review D36, 515–519 (1987)

M. Creutz, Physical Review D36, 515–519 (1987)

work page 1987