arxiv: 2605.15419 · v2 · pith:RCTLJG2Enew · submitted 2026-05-14 · 💻 cs.LG

Lagrangian Flow Matching: A Least-Action Framework for Principled Path Design

Pith reviewed 2026-05-21 08:06 UTC · model grok-4.3

classification 💻 cs.LG
keywords flow matching · Lagrangian mechanics · optimal transport · probability paths · generative modeling · simulation-free training · least action principle

The pith

Minimizing the action of a general Lagrangian yields simulation-free training objectives for designing probability paths in flow matching.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Lagrangian flow matching determines the probability path and velocity field by minimizing the action of a general Lagrangian subject to the continuity equation and prescribed initial and target distributions. This setup generalizes prior flow matching constructions, which correspond to the kinetic Lagrangian that produces straight-line trajectories between coupled points. The dynamic minimization problem is shown to be equivalent to a static optimal transport problem, which in turn supplies simulation-free training objectives for the velocity field. If the claim holds, it would allow the design of new paths based on different Lagrangians, such as harmonic oscillators, that produce altered learned dynamics in generative tasks.

Core claim

The central claim is that the dynamic problem of minimizing the action of a general Lagrangian subject to the continuity equation and prescribed endpoints admits an equivalent static optimal transport formulation. This equivalence produces a family of simulation-free training objectives. It recovers optimal transport-based flow matching as the kinetic special case and the trigonometric variance-preserving diffusion path as the harmonic-oscillator case. More general Lagrangians lead to new probability paths and velocity fields that induce meaningful changes in the learned dynamics.

What carries the argument

The least-action principle, in which the probability path and associated velocity field are found by minimizing the integral of the Lagrangian along trajectories subject to the continuity equation for the probability density.

If this is right

  • OT-based flow matching emerges when the Lagrangian reduces to kinetic energy alone.
  • The trigonometric variance-preserving diffusion paths emerge for the harmonic oscillator Lagrangian.
  • General Lagrangians produce new families of curved or otherwise modified probability paths.
  • These new paths lead to different velocity fields that remain competitive in performance with standard conditional flow matching.
  • The equivalence allows simulation-free training for any such Lagrangian choice.
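
The first two consequences can be made concrete with a minimal NumPy sketch of the two per-pair paths and their velocity targets. The harmonic formula is the one quoted from the paper in the Lean-theorems section of this page; the function names `kinetic_path` and `harmonic_path` are hypothetical, and the velocity is obtained here by differentiating that formula in t rather than copied from the paper.

```python
import numpy as np

def kinetic_path(x0, x1, t):
    """Straight-line interpolation and its constant velocity (kinetic Lagrangian)."""
    x_t = (1.0 - t) * x0 + t * x1
    v_t = x1 - x0
    return x_t, v_t

def harmonic_path(x0, x1, t, omega):
    """Harmonic-oscillator trajectory for 0 < omega < pi and its time derivative."""
    s = np.sin(omega)
    x_t = (np.sin(omega * (1.0 - t)) * x0 + np.sin(omega * t) * x1) / s
    # d/dt of the line above; this is the per-pair regression target at time t.
    v_t = omega * (-np.cos(omega * (1.0 - t)) * x0 + np.cos(omega * t) * x1) / s
    return x_t, v_t

# Illustrative endpoints only: four coupled pairs in two dimensions.
rng = np.random.default_rng(0)
x0 = rng.normal(size=(4, 2))
x1 = rng.normal(size=(4, 2))
t = 0.3

xk, vk = kinetic_path(x0, x1, t)
xh, vh = harmonic_path(x0, x1, t, omega=1.0)
```

At ω = π/2 the two coefficients reduce to cos(πt/2) and sin(πt/2), whose squares sum to one, which is the trigonometric variance-preserving form. As ω → 0 the harmonic path approaches the straight line.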

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Applying this framework to Lagrangians with additional physical terms could incorporate conservation principles into generative model training.
  • Testing these paths on datasets with specific geometric structures might reveal advantages over straight-line transports.
  • The mechanical interpretation could help explain why certain path choices work better for particular data distributions.
  • Extensions might connect this to variational principles in related generative modeling approaches.

Load-bearing premise

The dynamic optimization of the Lagrangian action under the continuity equation and fixed endpoints is equivalent to a static optimal transport problem between the initial and final distributions.
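
For orientation, the premise can be written in the standard notation of optimal transport with Lagrangian costs. The symbols below (μ0 and μ1 for the endpoint distributions, ρ_t and v_t for the path and velocity field, c for the induced per-pair cost, Π for the set of couplings) are assumed here for illustration; the paper's own notation and its precise hypotheses on L are not reproduced on this page.

```latex
% Sketch of the two problems whose equivalence is the load-bearing premise.
% Notation is assumed, not taken from the paper.
\begin{align*}
\text{(dynamic)}\quad
  & \inf_{(\rho_t, v_t)} \int_0^1 \!\! \int_{\mathbb{R}^d}
      L\bigl(x, v_t(x), t\bigr)\, \rho_t(x)\, dx\, dt
    \quad \text{s.t.} \quad
    \partial_t \rho_t + \nabla \cdot (\rho_t v_t) = 0,\;
    \rho_0 = \mu_0,\; \rho_1 = \mu_1, \\
\text{(per-pair cost)}\quad
  & c(x_0, x_1) = \inf_{\gamma(0) = x_0,\ \gamma(1) = x_1}
      \int_0^1 L\bigl(\gamma(t), \dot\gamma(t), t\bigr)\, dt, \\
\text{(static)}\quad
  & \inf_{\pi \in \Pi(\mu_0, \mu_1)} \int c(x_0, x_1)\, d\pi(x_0, x_1).
\end{align*}
```

For the kinetic Lagrangian L = ½‖v‖² the per-pair cost is ½‖x1 − x0‖², attained by the straight line, which is the Benamou–Brenier setting [6].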

What would settle it

Deriving the velocity field for a Lagrangian that includes a non-kinetic potential term and checking whether the resulting paths satisfy the continuity equation while differing from straight-line trajectories would test the claimed equivalence; mismatch in either property would falsify it.
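
Part of this test can be sketched numerically for the harmonic case, under stated assumptions. The snippet below checks only the per-pair trajectory: that it satisfies the Euler–Lagrange equation γ̈ + ω²γ = 0 up to finite-difference error and that it departs from the straight line. It does not verify the continuity equation at the population level. The values of ω and the endpoints are arbitrary choices, and `harmonic_gamma` is a hypothetical helper name.

```python
import numpy as np

def harmonic_gamma(t, x0, x1, omega):
    """Harmonic trajectory between x0 and x1 on [0, 1], valid for 0 < omega < pi."""
    return (np.sin(omega * (1.0 - t)) * x0 + np.sin(omega * t) * x1) / np.sin(omega)

omega, x0, x1 = 2.0, 0.0, 1.0          # arbitrary illustrative values
t = np.linspace(0.0, 1.0, 2001)
h = t[1] - t[0]
g = harmonic_gamma(t, x0, x1, omega)

# Central second difference approximates the acceleration at interior points.
acc = (g[2:] - 2.0 * g[1:-1] + g[:-2]) / h**2
el_residual = np.max(np.abs(acc + omega**2 * g[1:-1]))

# Largest gap between the harmonic trajectory and the straight line.
straight = (1.0 - t) * x0 + t * x1
max_deviation = np.max(np.abs(g - straight))

print(f"Euler-Lagrange residual: {el_residual:.2e}")
print(f"max deviation from straight line: {max_deviation:.3f}")
```

A small residual together with a clearly nonzero deviation is consistent with the claim for this one Lagrangian; it says nothing about Lagrangians without a closed-form trajectory.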

Figures

Figures reproduced from arXiv: 2605.15419 by Junzhe Zhang, Shukai Du, Yiming Li.

Figure 1: Lagrangian flow matching unifies trajectory selection (top …
Figure 2: Position γ_ω(t) and velocity γ̇_ω(t) of the harmonic least-action trajectory (4) for ω ∈ (0, π), with endpoints x0 = 0 and x1 = 1. Lagrangian mechanics provides a variational principle for selecting dynamics through an action functional, and has long served as a unifying language for describing physical systems [4]. A Lagrangian is a function L(x, v, t) of position x ∈ ℝ^d, velocity v ∈ ℝ^d, and time t …
Figure 3: Mini-batch harmonic flow matching trajectories from an 8-Gaussian-mixture to a double …
Figure 4: Mini-batch harmonic flow matching (ω = 1) from an 8-Gaussian-mixture source (olive) to a double-moon target (blue), swept over OT batch size n ∈ {1, 10, 25, 50, 100}. Increasing n tightens the empirical coupling toward the OT solution: compare the many-to-many fan-out at n = 1 to the bundled, mode-to-region routing at n = 100. Trajectory curvature is set by ω and is preserved across all panels; the two ax…
Figure 5: Sample quality vs. inference budget on N→moons across three ODE solvers (Euler, midpoint, RK4). OT-Harmonic reduces the 2-Wasserstein distance for a fixed NFE during inference …
Figure 6: CIFAR-10 images generated by OT-CFM (left) and …
Figure 7: OT-Aniso flow on the 8-Gaussians→two-moons benchmark. Synthetic experiments …
Figure 8: Sample quality vs. inference budget. We plot the 2-Wasserstein distance …
Figure 9: Uncurated CIFAR-10 samples from each flow-matching variant after …
Original abstract

Flow matching trains a neural velocity field by regression against a target velocity associated with a prescribed probability path connecting a simple initial distribution to the data distribution. A central design choice is the path itself. Existing constructions, including rectified and optimal-transport-based paths, transport samples along straight lines between coupled endpoints and thus cover only a narrow class of dynamics. We observe that this corresponds to the simplest case of the least-action principle in classical mechanics, in which the kinetic Lagrangian yields free-particle straight-line trajectories. Building on this observation, we propose Lagrangian flow matching, a physics-based framework in which the probability path and velocity field are determined by minimizing the action of a general Lagrangian subject to the continuity equation and the prescribed endpoints. We show that this dynamic problem admits an equivalent static optimal transport (OT) formulation, yielding a family of simulation-free training objectives that recover OT-based flow matching as the kinetic special case and the trigonometric variance-preserving diffusion path as the harmonic-oscillator case. More general Lagrangians give rise to new probability paths and velocity fields, and numerical experiments show that they induce meaningful changes in the learned dynamics while remaining competitive with existing conditional flow matching models.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper introduces Lagrangian Flow Matching, framing the design of probability paths in flow matching as minimization of the action of a general Lagrangian subject to the continuity equation and fixed marginals. It claims this dynamic problem is equivalent to a static optimal transport formulation, producing a family of simulation-free training objectives. Special cases recover OT-based flow matching (kinetic Lagrangian) and trigonometric variance-preserving diffusion paths (harmonic oscillator Lagrangian). Numerical experiments indicate that more general Lagrangians induce meaningful changes in learned dynamics while remaining competitive with existing conditional flow matching models.

Significance. If the claimed equivalence holds with simulation-free objectives for arbitrary Lagrangians, the framework would supply a principled, physics-motivated method for constructing probability paths beyond straight-line or fixed diffusion schedules. Recovering known methods as special cases strengthens the contribution, and the potential for new, data-adapted dynamics could improve generative modeling performance.

major comments (2)
  1. [Abstract] Abstract: The central claim that the dynamic least-action problem admits an equivalent static OT formulation yielding simulation-free objectives for general Lagrangians is load-bearing but unsupported by derivation. For the kinetic case the target velocity is (x1 − x0); for the harmonic oscillator it is trigonometric. For a generic Lagrangian the Euler-Lagrange equation produces a two-point boundary-value problem whose solution generally lacks closed form, so evaluating the regression target requires numerical integration and the objective ceases to be simulation-free.
  2. [Equivalence to static OT] The manuscript asserts equivalence to a static OT problem but supplies only the high-level statement without the explicit reduction, the resulting regression target, or verification that the continuity-equation constraint is preserved under the static formulation. A concrete derivation or worked example for a non-quadratic Lagrangian is required to substantiate the family of simulation-free objectives.
minor comments (2)
  1. [Abstract] The abstract refers to 'numerical experiments' showing competitive results and meaningful changes in dynamics, yet no datasets, baselines, metrics, or specific Lagrangians tested are mentioned.
  2. Notation for the general Lagrangian L and the action functional should be introduced with an explicit equation before the equivalence claim is stated.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful reading and constructive feedback on our manuscript. We address the major comments point by point below, clarifying the equivalence and committing to strengthen the presentation with explicit derivations.

  1. Referee: [Abstract] Abstract: The central claim that the dynamic least-action problem admits an equivalent static OT formulation yielding simulation-free objectives for general Lagrangians is load-bearing but unsupported by derivation. For the kinetic case the target velocity is (x1 − x0); for the harmonic oscillator it is trigonometric. For a generic Lagrangian the Euler-Lagrange equation produces a two-point boundary-value problem whose solution generally lacks closed form, so evaluating the regression target requires numerical integration and the objective ceases to be simulation-free.

    Authors: We agree that the abstract statement is concise and that the supporting derivation merits expansion. The equivalence is obtained by showing that the action-minimizing velocity field satisfying the continuity equation and fixed marginals coincides with the optimal transport map for the cost functional induced by the Lagrangian; the resulting regression target is then the velocity along the associated geodesic. For the kinetic and harmonic cases this velocity admits the closed forms noted by the referee. For general Lagrangians the two-point boundary-value problem may indeed lack a closed form, in which case the target is obtained by solving the Euler-Lagrange ODE once per training pair (a fixed, offline cost that does not involve simulating the generative dynamics). We will add a self-contained derivation of the static reduction together with a worked non-quadratic example in the revised manuscript to make this distinction explicit. revision: yes

  2. Referee: [Equivalence to static OT] The manuscript asserts equivalence to a static OT problem but supplies only the high-level statement without the explicit reduction, the resulting regression target, or verification that the continuity-equation constraint is preserved under the static formulation. A concrete derivation or worked example for a non-quadratic Lagrangian is required to substantiate the family of simulation-free objectives.

    Authors: We acknowledge that the main text presents the equivalence at a high level. The reduction proceeds by substituting the continuity-equation constraint into the action integral, yielding a static variational problem whose Euler-Lagrange conditions recover the same velocity field; because the optimal map automatically satisfies the prescribed marginals, the continuity equation remains satisfied by construction. We will insert the full step-by-step reduction (including verification of the constraint) and provide an explicit non-quadratic Lagrangian example with its corresponding regression target in the revision. revision: yes
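
The first response above describes obtaining the target by solving the Euler–Lagrange ODE once per training pair when no closed form exists. A minimal sketch of what that could look like is a shooting method in one dimension. Everything specific here is an assumption made for illustration: the quartic potential V(x) = x⁴/4 (so L = ½v² − V and γ̈ = −γ³), the RK4 integrator, the secant update, and the names `grad_potential`, `integrate`, and `shoot`. None of it is the authors' procedure, and a trajectory found this way is a stationary point of the action, which need not be the minimizer in general.

```python
import numpy as np

def grad_potential(x):
    # Hypothetical quartic potential V(x) = x**4 / 4, chosen only for illustration.
    return x**3

def integrate(x0, v0, n_steps=200):
    """RK4 integration of x'' = -V'(x) on [0, 1]; returns positions and velocities."""
    h = 1.0 / n_steps
    xs = np.empty(n_steps + 1)
    vs = np.empty(n_steps + 1)
    xs[0], vs[0] = x0, v0

    def f(x, v):
        return v, -grad_potential(x)

    for i in range(n_steps):
        x, v = xs[i], vs[i]
        k1x, k1v = f(x, v)
        k2x, k2v = f(x + 0.5 * h * k1x, v + 0.5 * h * k1v)
        k3x, k3v = f(x + 0.5 * h * k2x, v + 0.5 * h * k2v)
        k4x, k4v = f(x + h * k3x, v + h * k3v)
        xs[i + 1] = x + h * (k1x + 2 * k2x + 2 * k3x + k4x) / 6.0
        vs[i + 1] = v + h * (k1v + 2 * k2v + 2 * k3v + k4v) / 6.0
    return xs, vs

def shoot(x0, x1, tol=1e-9, max_iter=50):
    """Secant iteration on the initial velocity so that the trajectory ends at x1."""
    va = x1 - x0                      # straight-line guess
    vb = 1.1 * (x1 - x0) + 1e-3       # perturbed second guess
    ra = integrate(x0, va)[0][-1] - x1
    rb = integrate(x0, vb)[0][-1] - x1
    for _ in range(max_iter):
        if abs(rb) < tol:
            break
        va, vb, ra = vb, vb - rb * (vb - va) / (rb - ra), rb
        rb = integrate(x0, vb)[0][-1] - x1
    return vb

# One training pair: positions xs[i] and velocity targets vs[i] on a time grid.
v_init = shoot(0.0, 1.0)
xs, vs = integrate(0.0, v_init)
print(f"initial velocity: {v_init:.4f}, endpoint error: {abs(xs[-1] - 1.0):.1e}")
```

The cost is a handful of ODE solves per pair, which is the trade-off the referee's first major comment raises: the objective avoids simulating the learned generative dynamics, but it is no longer free of numerical integration.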

Circularity Check

0 steps flagged

No circularity: the derivation proceeds from the least-action principle to the static OT equivalence without reducing to its inputs or relying on self-citations.

The paper starts from the classical least-action principle applied to a general Lagrangian subject to the continuity equation and fixed marginals, then proves an equivalence to a static optimal transport problem whose solution supplies the velocity targets. Special cases (kinetic Lagrangian recovering OT flow matching; harmonic oscillator recovering trigonometric paths) follow directly as mathematical reductions rather than being presupposed. No step defines a quantity in terms of itself, renames a fitted result as a prediction, or relies on a load-bearing self-citation whose content is unverified outside the present work. The framework is therefore self-contained against external benchmarks in classical mechanics and optimal transport.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The abstract relies on standard mathematical constraints from fluid dynamics and optimal transport with no new free parameters or invented entities described.

axioms (2)
  • [standard math] The evolution of the probability density obeys the continuity equation.
    Used as the constraint in the dynamic least-action problem.
  • [domain assumption] Endpoints consist of a simple initial distribution and the data distribution.
    Standard setup for flow matching between noise and data.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

  • IndisputableMonolith/Cost costAlphaLog_fourth_deriv_at_zero / dAlembert_cosh_solution_aczel echoes

    ECHOES: this paper passage has the same mathematical shape or conceptual pattern as the Recognition theorem, but is not a direct formal dependency.

    A canonical non-kinetic example is L_ω(x, v) = ½‖v‖² − ½ω²‖x‖² ... Euler–Lagrange equation γ̈ + ω²γ = 0 has unique solution γ^ω_{x0,x1}(t) = [sin(ω(1−t)) / sin ω] x0 + [sin(ωt) / sin ω] x1

  • IndisputableMonolith/Foundation/BranchSelection branch_selection echoes

    ECHOES: this paper passage has the same mathematical shape or conceptual pattern as the Recognition theorem, but is not a direct formal dependency.

    We show that this dynamic problem admits an equivalent static optimal transport (OT) formulation, yielding a family of simulation-free training objectives

What do these tags mean?

  • matches: The paper's claim is directly supported by a theorem in the formal canon.
  • supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
  • extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
  • uses: The paper appears to rely on the theorem as machinery.
  • contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
  • unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

31 extracted references · 31 canonical work pages

  [1] M. S. Albergo, N. M. Boffi, and E. Vanden-Eijnden. Stochastic interpolants: A unifying framework for flows and diffusions. Journal of Machine Learning Research, 26(209):1–80, 2025.

  [2] M. S. Albergo and E. Vanden-Eijnden. Building normalizing flows with stochastic interpolants. In The Eleventh International Conference on Learning Representations (ICLR), 2023.

  [3] L. Ambrosio, N. Gigli, and G. Savaré. Gradient Flows: In Metric Spaces and in the Space of Probability Measures. Birkhäuser, Basel, 2005.

  [4] V. I. Arnold, K. Vogtmann, and A. Weinstein. Mathematical Methods of Classical Mechanics, volume 60 of Graduate Texts in Mathematics. Springer, New York, 2nd edition, 1989.

  [5] M. Balcerak, T. Amiranashvili, A. Terpin, S. Shit, L. Bogensperger, S. Kaltenbach, P. Koumoutsakos, and B. Menze. Energy matching: Unifying flow matching and energy-based models for generative modeling. arXiv preprint arXiv:2504.10612, 2025.

  [6] J.-D. Benamou and Y. Brenier. A computational fluid mechanics solution to the Monge–Kantorovich mass transfer problem. Numerische Mathematik, 84(3):375–393, 2000.

  [7] P. Bernard and B. Buffoni. Optimal mass transportation and Mather theory. Journal of the European Mathematical Society, 9(1):85–121, 2007.

  [8] D. Burkhardt, J. Bloom, R. Cannoodt, M. D. Luecken, S. Krishnaswamy, C. Lance, A. O. Pisco, and F. J. Theis. Multimodal single-cell integration across time, individuals, and batches. In NeurIPS 2022 Competition Track, 2022.

  [9] R. T. Q. Chen, Y. Rubanova, J. Bettencourt, and D. K. Duvenaud. Neural ordinary differential equations. In Advances in Neural Information Processing Systems, volume 31, 2018.

  [10] Y. Chen, T. T. Georgiou, and M. Pavon. On the relation between optimal transport and Schrödinger bridges: A stochastic control viewpoint. Journal of Optimization Theory and Applications, 169(2):671–691, 2016.

  [11] V. De Bortoli, J. Thornton, J. Heng, and A. Doucet. Diffusion Schrödinger bridge with applications to score-based generative modeling. In Advances in Neural Information Processing Systems, volume 34, pages 17695–17709, 2021.

  [12] Y. Du and I. Mordatch. Implicit generation and modeling with energy based models. In Advances in Neural Information Processing Systems, volume 32, 2019.

  [13] W. Grathwohl, R. T. Q. Chen, J. Bettencourt, and D. Duvenaud. Scalable reversible generative models with free-form continuous dynamics. In International Conference on Learning Representations (ICLR), 2019.

  [14] J. Ho, A. Jain, and P. Abbeel. Denoising diffusion probabilistic models. In Advances in Neural Information Processing Systems, volume 33, pages 6840–6851, 2020.

  [15] T. Koshizuka and I. Sato. Neural Lagrangian Schrödinger bridge: Diffusion modeling for population dynamics. arXiv preprint arXiv:2204.04853, 2022.

  [16] H. W. Kuhn. The Hungarian method for the assignment problem. Naval Research Logistics Quarterly, 2(1–2):83–97, 1955.

  [17] Y. LeCun, S. Chopra, R. Hadsell, M. Ranzato, and F. J. Huang. A tutorial on energy-based learning. In G. Bakir, T. Hofman, B. Schölkopf, A. Smola, and B. Taskar, editors, Predicting Structured Data. MIT Press, Cambridge, MA, 2006.

  [18] C. Léonard. A survey of the Schrödinger problem and some of its connections with optimal transport. Discrete and Continuous Dynamical Systems, 34(4):1533–1574, 2014.

  [19] Y. Lipman, R. T. Q. Chen, H. Ben-Hamu, M. Nickel, and M. Le. Flow matching for generative modeling. In The Eleventh International Conference on Learning Representations (ICLR), 2023.

  [20] X. Liu, C. Gong, and Q. Liu. Flow straight and fast: Learning to generate and transfer data with rectified flow. In The Eleventh International Conference on Learning Representations (ICLR), 2023.

  [21] K. R. Moon, D. van Dijk, Z. Wang, S. Gigante, D. B. Burkhardt, W. S. Chen, K. Yim, A. van den Elzen, M. J. Hirn, R. R. Coifman, et al. Visualizing structure and transitions in high-dimensional biological data. Nature Biotechnology, 37(12):1482–1492, 2019.

  [22] K. Neklyudov, R. Brekelmans, D. Severo, and A. Makhzani. Action matching: Learning stochastic dynamics from samples. In Proceedings of the 40th International Conference on Machine Learning (ICML), volume 202 of Proceedings of Machine Learning Research, pages 25858–25889. PMLR, 2023.

  [23] K. Neklyudov, R. Brekelmans, A. Tong, L. Atanackovic, Q. Liu, and A. Makhzani. A computational framework for solving Wasserstein Lagrangian flows. In Proceedings of the 41st International Conference on Machine Learning (ICML), 2024.

  [24] A.-A. Pooladian, H. Ben-Hamu, C. Domingo-Enrich, B. Amos, Y. Lipman, and R. T. Q. Chen. Multisample flow matching: Straightening flows with minibatch couplings. In Proceedings of the 40th International Conference on Machine Learning (ICML), volume 202 of Proceedings of Machine Learning Research, pages 28100–28127. PMLR, 2023.

  [25] A.-A. Pooladian, C. Domingo-Enrich, R. T. Q. Chen, and B. Amos. Neural optimal transport with Lagrangian costs. In Proceedings of the 40th Conference on Uncertainty in Artificial Intelligence (UAI), 2024.

  [26] J. Sohl-Dickstein, E. Weiss, N. Maheswaranathan, and S. Ganguli. Deep unsupervised learning using nonequilibrium thermodynamics. In Proceedings of the 32nd International Conference on Machine Learning (ICML), volume 37 of Proceedings of Machine Learning Research, pages 2256–2265. PMLR, 2015.

  [27] Y. Song, C. Durkan, I. Murray, and S. Ermon. Maximum likelihood training of score-based diffusion models. In Advances in Neural Information Processing Systems, volume 34, pages 1415–1428, 2021.

  [28] Y. Song, J. Sohl-Dickstein, D. P. Kingma, A. Kumar, S. Ermon, and B. Poole. Score-based generative modeling through stochastic differential equations. In International Conference on Learning Representations (ICLR), 2021.

  [29] A. Tong, K. Fatras, N. Malkin, G. Huguet, Y. Zhang, J. Rector-Brooks, G. Wolf, and Y. Bengio. Improving and generalizing flow-based generative models with minibatch optimal transport. Transactions on Machine Learning Research, 2024. Expert Certification.

  [30] C. Villani. Optimal Transport: Old and New, volume 338 of Grundlehren der mathematischen Wissenschaften. Springer, Berlin, Heidelberg, 2009.

    A. Related Work. Lagrangian flow matching sits at the intersection of two lines of work: flow- and diffusion-based generative models built on prescribed probability paths, and variational or action-based approaches to...

  [31] … −ω sin(ωt) x0 + ω cos(ωt) B. Per-pair kinetic energy. The kinetic energy along ϕ⋆ is k_ω(x0, x1) = ∫₀¹ ½‖ϕ̇⋆_t‖² dt. Expanding the square gives ‖ϕ̇⋆_t‖² = ω² …

    unifies these constructions through a stochastic differential equation, with sampling carried out by a reverse-time SDE or its associated probability-flow ODE. The probability path is implicitly prescribed by the forward noising schedule, and standard choices such as the variance-preserving and variance-exploding processes induce specific families of marg...