Recognition: no theorem link
FlowEqProp: Training Flow Matching Generative Models with Gradient Equilibrium Propagation
Pith reviewed 2026-05-10 17:46 UTC · model grok-4.3
The pith
Gradient Equilibrium Propagation trains flow matching generative models by encoding target velocities in equilibrium displacements.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Gradient Equilibrium Propagation enables training of flow matching generative models using only local equilibrium measurements and no backpropagation. It works by adding a purely quadratic spring potential that permits all network units to evolve, so that the equilibrium displacement encodes the target velocity field of the flow matching objective. When applied to a two-hidden-layer MLP on the Optical Recognition of Handwritten Digits dataset, the resulting FlowEqProp model generates recognizable digit samples across all ten classes with stable dynamics and supports improved generation through additional inference-time relaxation.
What carries the argument
Gradient Equilibrium Propagation (GradEP) using a quadratic spring potential that lets all units evolve and encodes the flow matching velocity field directly in the equilibrium displacement.
Load-bearing premise
A quadratic spring potential can be chosen so that the equilibrium displacement of every unit, including the visible units, accurately represents the target velocity field while keeping training stable and preserving hardware plausibility.
What would settle it
Train the described two-hidden-layer MLP on the Optical Recognition of Handwritten Digits dataset with GradEP and check whether it produces recognizable samples from all ten digit classes or whether the dynamics become unstable.
Original abstract
We introduce Gradient Equilibrium Propagation (GradEP), a mechanism that extends Equilibrium Propagation (EP) to train energy gradients rather than energy minima, enabling EP to be applied to tasks where the learning objective depends on the velocity field of a convergent dynamical system. Instead of fixing the input during dynamics as in standard EP, GradEP introduces a spring potential that allows all units, including the visible units, to evolve, encoding the learned velocity in the equilibrium displacement. The spring and resulting nudge terms are both purely quadratic, preserving EP's hardware plausibility for neuromorphic implementation. As a first demonstration, we apply GradEP to flow matching for generative modelling - an approach we call FlowEqProp - training a two-hidden-layer MLP (24,896 parameters) on the Optical Recognition of Handwritten Digits dataset using only local equilibrium measurements and no backpropagation. The model generates recognisable digit samples across all ten classes with stable training dynamics. We further show that the time-independent energy landscape enables extended generation beyond the training horizon, producing sharper samples through additional inference-time computation - a property that maps naturally onto neuromorphic hardware, where longer relaxation yields higher-quality outputs. To our knowledge, this is the first demonstration of EP training a flow-based generative model.
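For context on the objective named in the abstract: flow matching regresses a velocity network onto conditional targets along noise-to-data paths. Under the common linear-interpolation path of Lipman et al. (ref. [8] below), constructing training targets reduces to a few lines. The sketch is illustrative only (array shapes echo the flattened 8×8 digits; nothing here is the paper's code):

```python
import numpy as np

rng = np.random.default_rng(0)

def flow_matching_targets(x0, x1, rng):
    """Conditional flow matching targets for the linear path
    x_t = (1 - t) * x0 + t * x1, whose time derivative is x1 - x0.

    x0: noise samples, x1: data samples, both of shape (batch, dim).
    Returns (x_t, t, v_target); a model v_theta(x_t, t) would be
    regressed onto v_target with an L2 loss.
    """
    t = rng.uniform(size=(x0.shape[0], 1))   # one time per sample
    x_t = (1.0 - t) * x0 + t * x1
    v_target = x1 - x0                       # constant along the linear path
    return x_t, t, v_target

x0 = rng.standard_normal((4, 64))  # Gaussian noise
x1 = rng.standard_normal((4, 64))  # stand-in for flattened 8x8 digit data
x_t, t, v = flow_matching_targets(x0, x1, rng)
```

The training loss is then the mean of ||v_theta(x_t, t) − v_target||²; GradEP's claimed contribution is estimating this loss's parameter gradient from equilibrium measurements instead of backpropagation.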
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces Gradient Equilibrium Propagation (GradEP) as an extension of Equilibrium Propagation (EP) for training flow-matching generative models. By adding a quadratic spring potential, all units (including visible) evolve freely, with the equilibrium displacement encoding the target velocity field of the flow objective. Parameter gradients are then obtained from the free-nudged equilibrium difference using only local measurements and no backpropagation. The method is demonstrated by training a 24,896-parameter two-hidden-layer MLP on the Optical Recognition of Handwritten Digits dataset, producing recognizable digit samples with stable dynamics; the time-independent energy also permits extended inference-time generation for sharper outputs.
Significance. If the central encoding holds exactly, GradEP would enable hardware-plausible (neuromorphic) training of velocity-field objectives such as flow matching, extending EP beyond energy-minima tasks while preserving locality. The demonstration of stable training and extended generation on a small MLP is a concrete first step, and the absence of backpropagation plus the quadratic form of both spring and nudge terms are genuine strengths for neuromorphic mapping.
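The "free-nudged equilibrium difference" in the summary is the standard EP estimator (Scellier & Bengio, ref. [11]). A minimal sketch on a one-unit quadratic energy, where both equilibria have closed forms, shows the mechanism; the toy energy E(h) = h²/2 − w·s·h and all names are illustrative assumptions, not the paper's construction:

```python
def free_equilibrium(w, s):
    # minimizes E(h) = 0.5*h**2 - w*s*h, so h_free = w*s
    return w * s

def nudged_equilibrium(w, s, y_t, beta):
    # minimizes E(h) + beta * 0.5*(h - y_t)**2 (quadratic nudge toward y_t)
    return (w * s + beta * y_t) / (1.0 + beta)

def ep_gradient(w, s, y_t, beta):
    """EP estimate of dL/dw for L = 0.5*(h_free - y_t)**2, using only
    the local quantity dE/dw = -s*h measured at the two equilibria."""
    h_free = free_equilibrium(w, s)
    h_nudged = nudged_equilibrium(w, s, y_t, beta)
    dE_dw = lambda h: -s * h
    return (dE_dw(h_nudged) - dE_dw(h_free)) / beta

w, s, y_t, beta = 0.7, 1.3, 2.0, 1e-4
g_ep = ep_gradient(w, s, y_t, beta)
g_true = (w * s - y_t) * s   # analytic gradient, for comparison
```

As beta → 0 the estimate converges to the analytic gradient. GradEP keeps this two-phase recipe but, per the paper, replaces the output nudge with a quadratic spring so the objective can target a velocity field rather than a fixed output.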
major comments (1)
- The derivation of the spring-potential fixed point (Methods section) must explicitly show that the equilibrium displacement Δx is exactly (or provably unbiasedly) proportional to the target velocity v_target for the flow-matching loss ||v_θ − v_target||². The current description leaves open whether this mapping remains exact outside the linear-response regime, for nonlinear unit activations, or without explicit time-conditioning; if only approximate, the resulting EP update optimizes a surrogate rather than the intended objective. This is load-bearing for the central claim that GradEP trains the true flow-matching gradient via local measurements.
minor comments (2)
- Abstract and Results: no quantitative metrics (e.g., negative log-likelihood, FID, or sample quality scores), no baselines (standard flow matching with backprop or other EP variants), and no ablations on spring constant or nudge strength are reported. These are required to substantiate “stable dynamics” and “recognizable samples” beyond visual inspection.
- The model is small (≈25 k parameters) and the dataset is simple; discussion of scaling behavior or failure modes on higher-dimensional data would strengthen the neuromorphic-plausibility argument.
Simulated Author's Rebuttal
We thank the referee for their careful reading and constructive feedback. We address the single major comment below and will revise the manuscript to provide the requested explicit derivation.
Point-by-point responses
-
Referee: The derivation of the spring-potential fixed point (Methods section) must explicitly show that the equilibrium displacement Δx is exactly (or provably unbiasedly) proportional to the target velocity v_target for the flow-matching loss ||v_θ − v_target||². The current description leaves open whether this mapping remains exact outside the linear-response regime, for nonlinear unit activations, or without explicit time-conditioning; if only approximate, the resulting EP update optimizes a surrogate rather than the intended objective. This is load-bearing for the central claim that GradEP trains the true flow-matching gradient via local measurements.
Authors: We appreciate the referee's emphasis on this foundational aspect. In the revised Methods section, we will expand the derivation of the spring-potential fixed point to explicitly demonstrate that the equilibrium displacement Δx is exactly proportional to the target velocity v_target (with proportionality constant set by the spring stiffness). The derivation follows directly from the stationarity condition ∇_x E_network(x) + k Δx = 0 at the fixed point of the total energy, which encodes v_target via the quadratic spring term. This relation holds exactly for arbitrary differentiable (including nonlinear) activation functions, without invoking linear-response approximations or requiring explicit time-conditioning, because the energy remains time-independent by construction. Consequently, the parameter gradient extracted from the free-nudged equilibrium difference is the true gradient of the flow-matching objective ||v_θ − v_target||² rather than a surrogate. We will include the full algebraic steps to make this mapping unambiguous.
Revision: yes
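Written out, the stationarity argument in the response is short. A hedged reconstruction (k denotes the spring stiffness and x* the spring anchor, matching the quoted condition; the paper's exact constants and sign conventions may differ):

```latex
E_{\mathrm{total}}(x) = E_{\mathrm{network}}(x) + \tfrac{k}{2}\,\lVert x - x^{*}\rVert^{2},
\qquad \Delta x := x_{\mathrm{eq}} - x^{*},
\qquad
\nabla_x E_{\mathrm{total}}(x_{\mathrm{eq}}) = 0
\;\Longrightarrow\;
\Delta x = -\tfrac{1}{k}\,\nabla_x E_{\mathrm{network}}(x_{\mathrm{eq}}).
```

This identity is exact for any differentiable E_network, with no linearization. The open question the referee raises is whether identifying the model velocity with a multiple of the displacement (the paper reads velocities out as v = αλ(x* − x)) makes the two-phase update equal the flow-matching gradient; that is what the promised algebra in the revision must establish.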
Circularity Check
No circularity: novel spring-potential construction for velocity encoding is introduced rather than reduced to prior fits or definitions
full rationale
The paper defines GradEP by adding an explicit quadratic spring term to the energy function, allowing visible units to evolve so that equilibrium displacement represents the flow-matching velocity field. This is a new ansatz presented with hardware-plausibility arguments and demonstrated empirically on a 25k-parameter MLP, without any quoted reduction of the target loss gradient to a fitted parameter or self-referential definition. No load-bearing self-citations or uniqueness theorems from prior author work are invoked to force the result. The central claim (local EP updates optimize the flow objective via displacement encoding) rests on the stated dynamics and empirical samples rather than tautological equivalence to inputs.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption: Dynamical system converges to a stable equilibrium under the combined energy and spring potential
- ad hoc to paper: Quadratic potentials are hardware-plausible for neuromorphic implementation
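The first axiom can be probed numerically in miniature: under a convex internal energy plus a quadratic spring, gradient-flow relaxation converges and the equilibrium displacement equals the internal energy gradient divided by the stiffness. The quadratic stand-in E_int(x) = x·Ax/2 and the explicit-Euler relaxation below are illustrative assumptions, not the paper's network:

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 8, 2.0

# Convex stand-in for the internal energy: E_int(x) = 0.5 * x.T @ A @ x
B = rng.standard_normal((n, n))
A = B @ B.T / n + np.eye(n)        # symmetric positive definite
x_star = rng.standard_normal(n)    # spring anchor (e.g. the clamped target)

def grad_total(x):
    # gradient of E_int(x) + (k/2) * ||x - x_star||**2
    return A @ x + k * (x - x_star)

# Relax to equilibrium by explicit-Euler gradient flow.
x = rng.standard_normal(n)
for _ in range(5000):
    x -= 0.01 * grad_total(x)

# At equilibrium, grad E_int(x) = k * (x_star - x): the displacement
# encodes the internal gradient exactly, so the residual should vanish.
residual = A @ x - k * (x_star - x)
```

For this convex toy, convergence is guaranteed; the axiom's force is that the paper assumes the same holds for the trained nonlinear network's energy landscape.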
invented entities (2)
- Gradient Equilibrium Propagation (GradEP): no independent evidence
- spring potential: no independent evidence
Reference graph
Works this paper leans on
- [1] Michal Balcerak, Tamaz Amiranashvili, Antonio Terpin, Suprosanna Shit, Lea Bogensperger, Sebastian Kaltenbach, Petros Koumoutsakos, and Bjoern Menze. 2025. Energy Matching: Unifying Flow Matching and Energy-Based Models for Generative Modeling. doi:10.48550/arXiv.2504.10612 arXiv:2504.10612 [cs]
- [2]
- [3] E. Alpaydin and Fevzi Alimoglu. 1996. Pen-Based Recognition of Handwritten Digits. doi:10.24432/C5MG6K
- [4] Maxence Ernoult, Julie Grollier, Damien Querlioz, Yoshua Bengio, and Benjamin Scellier. 2019. Updates of Equilibrium Prop Match Gradients of Backprop Through Time in an RNN with Static Input. doi:10.48550/arXiv.1905.13633 arXiv:1905.13633 [cs]
- [5]
- [6] Benjamin Hoover, Yuchen Liang, Bao Pham, Rameswar Panda, Hendrik Strobelt, Duen Horng Chau, Mohammed Zaki, and Dmitry Krotov. 2023. Energy Transformer. Advances in Neural Information Processing Systems 36 (Dec. 2023), 27532–27559. https://proceedings.neurips.cc/paper_files/paper/2023/hash/57a9b97477b67936298489e3c1417b0a-Abstract-Conference.html
- [7]
- [8] Yaron Lipman, Ricky T. Q. Chen, Heli Ben-Hamu, Maximilian Nickel, and Matt Le. 2023. Flow Matching for Generative Modeling. doi:10.48550/arXiv.2210.02747 arXiv:2210.02747 [cs]
- [9]
- [10] Hubert Ramsauer, Bernhard Schäfl, Johannes Lehner, Philipp Seidl, Michael Widrich, Thomas Adler, Lukas Gruber, Markus Holzleitner, Milena Pavlović, Geir Kjetil Sandve, Victor Greiff, David Kreil, Michael Kopp, Günter Klambauer, Johannes Brandstetter, and Sepp Hochreiter. 2021. Hopfield Networks is All You Need. http://arxiv.org/abs/2008.02217 arXiv:2008.02217
- [11] Benjamin Scellier and Yoshua Bengio. 2017. Equilibrium Propagation: Bridging the Gap Between Energy-Based Models and Backpropagation. http://arxiv.org/abs/1602.05179 arXiv:1602.05179
- [12] Yang Song and Stefano Ermon. 2019. Generative Modeling by Estimating Gradients of the Data Distribution. In Advances in Neural Information Processing Systems, Vol. 32. Curran Associates, Inc. https://proceedings.neurips.cc/paper_files/paper/2019/hash/3001ef257407d5a371a96dcd947c7d93-Abstract.html
- [13] Alexander Tong, Kilian Fatras, Nikolay Malkin, Guillaume Huguet, Yanlei Zhang, Jarrid Rector-Brooks, Guy Wolf, and Yoshua Bengio. 2024. Improving and generalizing flow-based generative models with minibatch optimal transport. doi:10.48550/arXiv.2302.00482 arXiv:2302.00482 [cs]
- [14] Tianshi Wang, Leon Wu, Parth Nobel, and Jaijeet Roychowdhury. 2021. Solving combinatorial optimisation problems using oscillator based Ising machines. Natural Computing 20, 2 (June 2021), 287–306. doi:10.1007/s11047-021-09845-3
discussion (0)