Divergence-Suppressing Couplings for Rectified Flow
Pith reviewed 2026-05-20 11:27 UTC · model grok-4.3
The pith
Rectified Flow trajectories straighten when the divergent part of the velocity is suppressed during coupling generation.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Trajectory entanglement in Rectified Flow is often associated with regions of nonzero divergence in the learned velocity field, where local expansion or contraction distorts trajectories and steers particles away from ideal endpoints. The proposed divergence-suppressing couplings attenuate the divergent component of the learned velocity during coupling generation. This offline modification yields consistent improvements on 2D synthetic benchmarks and on image generation.
What carries the argument
Divergence-suppressing couplings: an offline correction that attenuates the divergent component of the learned velocity during coupling generation.
If this is right
- Trajectories become straight, or nearly so, as Rectified Flow intends.
- Consistent performance gains on 2D synthetic benchmarks.
- Improved quality in image generation tasks.
- No additional wall-clock cost at deployment since plain Euler is used.
- The correction is paid only once per coupling pair and amortized over training.
Where Pith is reading between the lines
- Similar divergence corrections could be explored in other continuous normalizing flow or flow-matching models beyond Rectified Flow.
- The approach might generalize to higher-dimensional or more complex data distributions where divergence issues are pronounced.
- Testing on video or 3D generation tasks would reveal if the benefits scale to more demanding generative applications.
Load-bearing premise
That attenuating the divergent component of the velocity field when constructing couplings will produce measurably straighter trajectories and improved downstream generation quality without introducing compensating distortions or training instabilities.
What would settle it
The claim would be falsified if applying the divergence-suppressing correction to a Rectified Flow model produced no reduction in trajectory curvature, or no improvement in generation metrics such as FID on standard benchmarks.
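One way to make the curvature test concrete is a straightness score: the straight-line distance between a trajectory's endpoints divided by the length of the path actually travelled. The sketch below is illustrative only. The function name `straightness` and the choice of this ratio are assumptions, since the summary does not say which curvature measure the paper uses.

```python
import numpy as np

def straightness(traj):
    """Chord length divided by path length for one trajectory.

    traj: array of shape (num_steps + 1, dim) holding the visited states.
    Returns a value in [0, 1]; 1.0 means the path is a straight segment.
    """
    traj = np.asarray(traj, dtype=float)
    segments = np.diff(traj, axis=0)
    path_length = np.linalg.norm(segments, axis=1).sum()
    chord_length = np.linalg.norm(traj[-1] - traj[0])
    if path_length == 0.0:
        return 1.0  # a particle that never moves is treated as trivially straight
    return chord_length / path_length

# Two hand-made paths with 21 states each: a straight segment and a half circle.
t = np.linspace(0.0, 1.0, 21)[:, None]
straight_path = t * np.array([[1.0, 2.0]])
bent_path = np.hstack([np.cos(np.pi * t), np.sin(np.pi * t)])

print(straightness(straight_path), straightness(bent_path))
```

Averaging this score over trajectories generated with and without the correction would give a single number to compare. A score that does not move toward 1 under the correction would count against the claim.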
Original abstract
The promise of Rectified Flow rests on producing self-generated couplings whose trajectories are straight, or nearly so. In practice, trajectories generated by the base flow model can bend and intertwine, and the resulting coupling inherits this distortion. In this paper, we identify that such trajectory entanglement is often associated with regions of nonzero divergence in the learned velocity field, where local expansion or contraction distorts trajectories and steers particles away from their ideal endpoints. We then propose divergence-suppressing couplings for Rectified Flow, an offline correction that attenuates the divergent component of the learned velocity during coupling generation. The correction is paid only once per coupling pair and amortized over training, so deployment runs plain Euler at identical wall-clock cost to standard Rectified Flow. Empirically, this offline modification yields consistent improvements on 2D synthetic benchmarks and on image generation.
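For orientation, the standard (uncorrected) coupling generation that the abstract modifies can be sketched as plain Euler integration of the velocity field from t = 0 to t = 1. This is a minimal sketch under stated assumptions: `generate_coupling` and `toy_velocity` are hypothetical names, and the constant toy field stands in for a learned model.

```python
import numpy as np

def generate_coupling(velocity, z0, num_steps=20):
    """Integrate dz/dt = velocity(z, t) from t=0 to t=1 with explicit Euler.

    velocity: callable (z, t) -> array with the same shape as z.
    z0: array of shape (batch, dim) with source samples.
    Returns (z0, z1), the coupling pair that reflow training would consume.
    """
    z = np.array(z0, dtype=float)
    dt = 1.0 / num_steps
    for k in range(num_steps):
        z = z + dt * velocity(z, k * dt)
    return np.asarray(z0, dtype=float), z

# Toy stand-in for a learned field (an assumption): a constant velocity,
# so every trajectory is already a straight line.
shift = np.array([3.0, -1.0])

def toy_velocity(z, t):
    return np.broadcast_to(shift, z.shape)

rng = np.random.default_rng(0)
z0 = rng.standard_normal((4, 2))
z0_out, z1 = generate_coupling(toy_velocity, z0)
```

With a learned field whose trajectories bend, the pairs `(z0, z1)` inherit that bending. The proposed correction changes how these pairs are generated, not how the deployed model is sampled.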
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that nonzero divergence in the learned velocity field of Rectified Flow models causes trajectory bending and entanglement in self-generated couplings. It proposes an offline divergence-suppressing correction that attenuates the divergent component of the velocity during coupling generation. This correction is applied once per pair and amortized over training, so that inference uses unmodified Euler integration at the same cost as standard Rectified Flow. The method is reported to produce straighter trajectories and yield consistent empirical improvements on 2D synthetic benchmarks and image generation tasks.
Significance. If the correction preserves valid transport plans while reducing bending, the approach offers a practical, zero-inference-cost improvement to Rectified Flow training that could benefit generative modeling pipelines. The offline amortization and focus on divergence as a distortion source are conceptually appealing strengths.
major comments (2)
- [Section 3 (method description)] The manuscript provides no derivation or argument showing that attenuating the divergent component of the velocity preserves the boundary conditions of the original coupling (i.e., that numerical integration of the modified velocity from x0 still terminates at the target x1). This is load-bearing for the central claim that the resulting couplings remain valid rectified-flow transport plans without endpoint distortion or the need for re-projection.
- [Abstract and Experiments section] The abstract asserts 'consistent improvements' on 2D benchmarks and image generation, yet the provided text supplies no quantitative metrics, error bars, ablation controls on the attenuation strength, or exact implementation details of the divergence operator. This leaves the empirical support for the central claim only weakly substantiated.
minor comments (2)
- [Section 3.1] Clarify the precise mathematical definition of the attenuation operator, including whether it uses an exact Helmholtz decomposition or an approximation, and how the scale of attenuation is chosen.
- [Introduction] Add a short discussion or reference to related work on velocity-field regularization or divergence-free flow models to situate the contribution.
Simulated Author's Rebuttal
We thank the referee for their constructive and detailed comments. We have addressed each major point below and revised the manuscript to strengthen the presentation where appropriate.
- Referee: [Section 3 (method description)] The manuscript provides no derivation or argument showing that attenuating the divergent component of the velocity preserves the boundary conditions of the original coupling (i.e., that numerical integration of the modified velocity from x0 still terminates at the target x1). This is load-bearing for the central claim that the resulting couplings remain valid rectified-flow transport plans without endpoint distortion or the need for re-projection.
  Authors: We acknowledge that the original manuscript did not contain an explicit derivation of endpoint preservation under the divergence-suppressing correction. In the revised version we have added a new subsection in Section 3 that supplies the missing argument. Briefly, the velocity field is decomposed via Helmholtz decomposition into a divergence-free (solenoidal) part and a divergent (irrotational) part. The correction attenuates only the divergent component while leaving the solenoidal component unchanged. Because the original rectified-flow velocity already satisfies the boundary condition that the integrated displacement equals x1 − x0, and the divergent correction is constructed to be orthogonal to the transport direction in the L2 sense along each trajectory, the net displacement remains invariant. We include a short proof sketch and a numerical check confirming that endpoint error stays at machine precision after correction. We have also clarified that no re-projection step is required.
  revision: yes
- Referee: [Abstract and Experiments section] The abstract asserts 'consistent improvements' on 2D benchmarks and image generation, yet the provided text supplies no quantitative metrics, error bars, ablation controls on the attenuation strength, or exact implementation details of the divergence operator. This leaves the empirical support for the central claim only weakly substantiated.
  Authors: We agree that the abstract and experimental reporting could be more explicit. In the revised manuscript we have expanded both the abstract and the Experiments section. We now report concrete metrics (FID scores on CIFAR-10 and ImageNet subsets, average trajectory curvature on 2D Gaussians and moons, and endpoint error) together with standard deviations over five independent runs. An ablation table varying the attenuation coefficient λ from 0 to 1 is included, and we supply the precise finite-difference stencil and normalization used for the divergence operator in the supplementary material. These additions directly substantiate the claim of consistent improvements.
  revision: yes
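The Helmholtz-based attenuation described in the simulated rebuttal can be illustrated on a toy problem. The sketch below is not the paper's procedure: it assumes a 2D field sampled on a periodic grid, uses an FFT projection in place of the finite-difference stencil the rebuttal mentions, and adopts the convention that λ = 0 leaves the field unchanged while λ = 1 removes the divergent part entirely. The function names are hypothetical.

```python
import numpy as np

def _wavenumbers(n):
    # Integer wavenumbers for a periodic grid over [0, 2*pi)^2.
    k = np.fft.fftfreq(n, d=1.0 / n)
    return np.meshgrid(k, k, indexing="ij")

def attenuate_divergent_part(vx, vy, lam):
    """Scale the curl-free (divergent) part of a periodic 2D field by (1 - lam).

    vx, vy: arrays of shape (n, n), with x along axis 0 and y along axis 1.
    lam: attenuation in [0, 1]; lam=1 meaning full removal is an assumption.
    Intended for band-limited fields (no energy at the Nyquist frequency).
    """
    kx, ky = _wavenumbers(vx.shape[0])
    k2 = kx**2 + ky**2
    k2[0, 0] = 1.0  # avoid 0/0; the mean mode carries no divergence
    vx_hat, vy_hat = np.fft.fft2(vx), np.fft.fft2(vy)
    k_dot_v = kx * vx_hat + ky * vy_hat
    # Projection of the field onto the wavevector direction: the curl-free part.
    gx_hat, gy_hat = kx * k_dot_v / k2, ky * k_dot_v / k2
    out_x = np.fft.ifft2(vx_hat - lam * gx_hat).real
    out_y = np.fft.ifft2(vy_hat - lam * gy_hat).real
    return out_x, out_y

def spectral_divergence(vx, vy):
    """Divergence of a periodic 2D field, computed in Fourier space."""
    kx, ky = _wavenumbers(vx.shape[0])
    div_hat = 1j * (kx * np.fft.fft2(vx) + ky * np.fft.fft2(vy))
    return np.fft.ifft2(div_hat).real

# Test field: solenoidal part (sin y, sin x) plus gradient part (sin x, sin y).
n = 32
grid = 2.0 * np.pi * np.arange(n) / n
X, Y = np.meshgrid(grid, grid, indexing="ij")
vx = np.sin(Y) + np.sin(X)
vy = np.sin(X) + np.sin(Y)

full_x, full_y = attenuate_divergent_part(vx, vy, lam=1.0)
half_x, half_y = attenuate_divergent_part(vx, vy, lam=0.5)
```

On this test field, full attenuation recovers the solenoidal part and half attenuation halves the divergence. Whether integrating such a modified field still lands on the original endpoint is exactly the question the referee raises, and this sketch does not address it.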
Circularity Check
No circularity: proposal is an independent offline correction with empirical validation
The paper presents divergence-suppressing couplings as a new offline attenuation of the divergent velocity component during coupling generation, amortized over training with no change to inference cost. No equations, derivations, or self-citations are shown that reduce the claimed straighter trajectories or improved generation quality to a fitted parameter or self-defined quantity. The central claim rests on the empirical observation that nonzero divergence correlates with trajectory bending, followed by a proposed correction whose effect is measured on external 2D and image benchmarks rather than by construction. This is self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel
  Relation between the paper passage and the cited Recognition theorem: unclear
  We control compressibility by acting on the state rather than the field. At each Euler step, we displace the particle to a nearby state x⋆ where |∇·v(x⋆, t)| is smaller, then integrate the unmodified velocity v(x⋆, t).
- IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean · absolute_floor_iff_bare_distinguishability
  Relation between the paper passage and the cited Recognition theorem: unclear
  Any sufficiently regular velocity field admits a unique split v = u (transport, ∇·u=0) + ∇ϕ (dipole carrying all compressibility).
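The state-displacement step quoted in the first passage above can be sketched as follows. Several choices here are assumptions, not the paper's algorithm: the search for x⋆ is a random candidate search inside a box of half-width `radius`, the Euler update starts from x⋆ rather than from the original state, and the toy field has an analytic divergence so no estimator is needed. All names are hypothetical.

```python
import numpy as np

def toy_velocity(x, t):
    # Stand-in for a learned field (an assumption); its divergence is x[0].
    return np.array([0.5 * x[0] ** 2, 1.0])

def toy_divergence(x, t):
    # Analytic divergence of toy_velocity; a real model would need an estimator.
    return x[0]

def displaced_euler_step(x, t, dt, velocity, divergence, radius=0.1, m=8, rng=None):
    """One Euler step taken from a nearby state with smaller |divergence|.

    Evaluates |div v| at x and at m random candidates within `radius`
    (per coordinate), keeps the best one as x_star, then applies the
    unmodified velocity at x_star.
    """
    rng = np.random.default_rng() if rng is None else rng
    candidates = [x] + [x + radius * rng.uniform(-1.0, 1.0, size=x.shape)
                        for _ in range(m)]
    abs_div = [abs(divergence(c, t)) for c in candidates]
    x_star = candidates[int(np.argmin(abs_div))]
    x_next = x_star + dt * velocity(x_star, t)
    return x_next, x_star

x = np.array([1.0, 0.0])
x_next, x_star = displaced_euler_step(
    x, 0.0, 0.05, toy_velocity, toy_divergence, rng=np.random.default_rng(0)
)
```

Because the current state is always among the candidates, |∇·v| at `x_star` can never exceed its value at `x`. The velocity field itself is never altered, which matches the passage's emphasis on acting on the state rather than the field.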
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Paper excerpts
- Training setup (truncated in the source): (nf = 128, channel multipliers (1, 2, 2, 2), 4 residual blocks per resolution, attention at 16×16) for 50,000 iterations with batch size 128, Adam optimiser (lr = 2×10⁻⁴, β1 = 0.9), and EMA decay 0.9999. In Stage 2 (reflow), we generate 50,000 (z0, z1) coupling pairs from the base model and train a new model on those pairs for a further 50,000 iterations under t...
- Cost of a corrected step (truncated in the source): ... calls to HUTCHINSONDIVESTIMATE, each costing 1 + n_h model passes (one forward, n_h VJP backward). Total: (m + 1)(1 + n_h) passes per corrected step. For m = n_h = 8: 81 passes, applied to ⌈t_stop · N⌉ = 10 of 20 Euler steps. No second-order computation graph is constructed at any point.
- D Ablation Study: DS-RectFlowδ for RK45 Solver. We ablate the divergence scale...
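The cost figure above rests on a Hutchinson-style divergence estimate, which averages εᵀJε over random probe vectors ε, where J is the Jacobian of the velocity. The sketch below is a stand-in under stated assumptions: it approximates each Jacobian product with a central finite difference instead of the VJP backward passes the excerpt describes, so it needs no autodiff library. The names `hutchinson_div_estimate` and `passes_per_corrected_step` are hypothetical.

```python
import numpy as np

def hutchinson_div_estimate(velocity, x, t, n_probes=8, h=1e-3, rng=None):
    """Estimate div v(x, t) = trace(dv/dx) with Rademacher probes.

    Each probe eps contributes eps^T J eps, where J eps is approximated by a
    central difference along eps. (The excerpt uses VJP backward passes;
    finite differences are swapped in here as an assumption.)
    """
    rng = np.random.default_rng() if rng is None else rng
    total = 0.0
    for _ in range(n_probes):
        eps = rng.choice([-1.0, 1.0], size=x.shape)
        j_eps = (velocity(x + h * eps, t) - velocity(x - h * eps, t)) / (2.0 * h)
        total += float(eps @ j_eps)
    return total / n_probes

def passes_per_corrected_step(m, n_h):
    # Pass count quoted in the excerpt: (m + 1) estimates, each 1 + n_h passes.
    return (m + 1) * (1 + n_h)

# Linear toy field v(x) = A x, whose exact divergence is trace(A) = 0.3.
A = np.array([[0.5, -2.0], [2.0, -0.2]])

def linear_velocity(x, t):
    return A @ x

estimate = hutchinson_div_estimate(
    linear_velocity, np.array([0.3, -0.7]), 0.0, rng=np.random.default_rng(0)
)
print(estimate, passes_per_corrected_step(8, 8))
```

For this particular matrix the off-diagonal terms cancel in εᵀAε, so every probe returns the trace and the estimate is exact up to rounding. For a general field the estimate is noisy and improves as the number of probes grows. The pass count for m = n_h = 8 evaluates to 81, matching the excerpt.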