Probability-Conserving Flow Guidance
Pith reviewed 2026-05-20 05:40 UTC · model grok-4.3
The pith
Guidance in diffusion models breaks probability conservation unless its diverging divergence term is scheduled to stay bounded near the data manifold.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim is that guidance effects decompose invariantly into a divergence term and a score-parallel term across parameterizations, and that the divergence term blows up structurally as the sampling trajectory approaches the data manifold. This motivates a time-dependent schedule for the divergence alongside score-parallel attenuation, resulting in the Adaptive Manifold Guidance rule that bounds both contributions while preserving the probability flow at no extra computational cost during inference.
What carries the argument
The decomposition of the guidance velocity into a divergence term and a score-parallel term derived from the continuity equation, which remains invariant across different model parameterizations.
Load-bearing premise
The sampling trajectory follows the continuity equation exactly as a continuous probability flow, with the divergence and score-parallel decomposition staying dominant and invariant near the data manifold.
What would settle it
A numerical simulation in a low-dimensional Gaussian mixture model where the measured divergence of the guided velocity is tracked as the trajectory approaches the data support, checking if it increases without bound under standard guidance but stays controlled with the proposed schedule.
Figures
read the original abstract
Diffusion and flow-based generative models dominate visual synthesis, with guidance aligning samples to user input and improving perceptual quality. However, Classifier-Free Guidance (CFG) and extrapolation-based methods are heuristic linear combinations of velocities/scores that ignore the generative manifold geometry, breaking probability conservation and driving samples off the learned manifold under strong guidance. We analyse guidance through the continuity equation and show its effect decomposes into a divergence term and a score-parallel term defined invariantly across parameterisations. We prove the divergence term blows up structurally as sampling approaches the data manifold, motivating a time-dependent schedule alongside score-parallel attenuation. The resulting plug-and-play rule, Adaptive Manifold Guidance (AdaMaG), bounds both terms at no additional inference cost. Finally, we show that most empirical heuristics for reducing saturation or improving generation quality correspond directly to the two terms in our decomposition. Across image generation benchmarks, AdaMaG improves realism, reduces hallucinations, and induces controlled desaturation in high-guidance regimes.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper analyzes guidance in diffusion and flow-based generative models through the continuity equation, decomposing its effect into an invariant divergence term and a score-parallel term. It proves that the divergence term blows up structurally near the data manifold, motivating a time-dependent schedule combined with score-parallel attenuation. This leads to the plug-and-play Adaptive Manifold Guidance (AdaMaG) rule, which the authors claim bounds both terms at no extra inference cost. The work also maps common empirical heuristics to the two terms in the decomposition and reports benchmark improvements in realism, reduced hallucinations, and controlled desaturation for image generation.
Significance. If the decomposition and blow-up proof hold under the paper's assumptions, the result provides a principled, geometry-aware alternative to heuristic guidance methods like CFG that better respects probability conservation. The invariant formulation across parameterizations and the explicit link to existing heuristics are useful contributions. The no-additional-cost claim and plug-and-play nature would make adoption straightforward if the discretization concerns are addressed. The reported benchmark gains suggest practical value for visual synthesis, but the overall significance depends on verifying the continuous-flow analysis against real sampling trajectories.
major comments (2)
- The central proof that the divergence term blows up structurally as the trajectory approaches the data manifold (stated in the abstract and developed in the analysis) assumes the generative sampling trajectory obeys the continuity equation exactly as a continuous probability flow. This needs explicit justification against the finite-step numerical integration (Euler, Heun, or higher-order solvers) actually used in sampling, because local truncation errors near the manifold can become comparable to or exceed the claimed structural divergence and thereby weaken both the motivation for the time-dependent schedule and the claim that AdaMaG bounds the terms without additional cost.
- The experimental claims of improved realism and reduced hallucinations rest on benchmarks whose implementation details, exact AdaMaG schedule, ablation of the two terms, and comparison to strong baselines are not fully specified in the provided material. Without these, it is unclear whether the gains are attributable to the proposed decomposition or to other implementation choices.
minor comments (2)
- A table or explicit list mapping the most common empirical heuristics (saturation reduction, quality improvements, etc.) to the divergence and score-parallel terms would make the correspondence claim easier to verify.
- Notation for the divergence and score-parallel terms should be introduced with a single clear definition early in the analysis section to avoid any ambiguity when the time-dependent schedule is later applied.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed feedback. We address each major comment below and describe the revisions that will be incorporated to strengthen the manuscript.
read point-by-point responses
-
Referee: The central proof that the divergence term blows up structurally as the trajectory approaches the data manifold (stated in the abstract and developed in the analysis) assumes the generative sampling trajectory obeys the continuity equation exactly as a continuous probability flow. This needs explicit justification against the finite-step numerical integration (Euler, Heun, or higher-order solvers) actually used in sampling, because local truncation errors near the manifold can become comparable to or exceed the claimed structural divergence and thereby weaken both the motivation for the time-dependent schedule and the claim that AdaMaG bounds the terms without additional cost.
Authors: We agree that the continuous-flow analysis requires explicit bridging to discrete sampling. The structural divergence is a geometric consequence of the manifold that remains dominant even under the small local truncation errors of standard ODE solvers, because the sampling trajectory must still converge to the data manifold. In the revised manuscript we will add a dedicated subsection that (i) recalls standard local error bounds for Euler/Heun integrators, (ii) shows analytically that the divergence term grows faster than these truncation errors near the manifold, and (iii) reports empirical measurements of the divergence term along actual discrete trajectories. These additions will reinforce both the motivation for the time-dependent schedule and the claim that AdaMaG incurs no extra cost. revision: yes
-
Referee: The experimental claims of improved realism and reduced hallucinations rest on benchmarks whose implementation details, exact AdaMaG schedule, ablation of the two terms, and comparison to strong baselines are not fully specified in the provided material. Without these, it is unclear whether the gains are attributable to the proposed decomposition or to other implementation choices.
Authors: We accept that the experimental section must be more transparent. The revised manuscript will include a comprehensive appendix containing: the precise functional form and hyper-parameters of the AdaMaG schedule used in every experiment, complete implementation details and random seeds for all benchmarks, full ablations that isolate the divergence term from the score-parallel term, and head-to-head comparisons against strong baselines (CFG at multiple scales, other guidance variants). These additions will make clear that the reported gains in realism and hallucination reduction are directly attributable to the probability-conserving decomposition. revision: yes
Circularity Check
Derivation from continuity equation is self-contained with no reduction to inputs by construction
full rationale
The paper's central derivation applies the standard continuity equation to decompose guidance into an invariant divergence term and score-parallel term, then proves the divergence blows up structurally near the manifold. This follows directly from the PDE without defining any quantity in terms of the claimed result, without fitting parameters to data subsets and relabeling them as predictions, and without load-bearing self-citations or imported uniqueness theorems. The resulting AdaMaG rule is motivated by this analysis rather than presupposing it, and the paper remains self-contained against external mathematical benchmarks for the continuity equation.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption Generative sampling obeys the continuity equation in the chosen parameterization.
Reference graph
Works this paper leans on
-
[1]
eDiff-I: Text-to-Image Diffusion Models with an Ensemble of Expert Denoisers
Yogesh Balaji, Seungjun Nah, Xun Huang, Arash Vahdat, Jiaming Song, Qinsheng Zhang, Karsten Kreis, Miika Aittala, Timo Aila, Samuli Laine, et al. ediff-i: Text-to-image diffusion models with an ensemble of expert denoisers.arXiv preprint arXiv:2211.01324,
work page internal anchor Pith review Pith/arXiv arXiv
-
[2]
Hyunmin Cho, Donghoon Ahn, Susung Hong, Jee Eun Kim, Seungryong Kim, and Kyong Hwan Jin. Tag: Tangential amplifying guidance for hallucination-resistant diffusion sampling.arXiv preprint arXiv:2510.04533,
-
[3]
Chen, B., Martí Monsó, D., Du, Y ., Simchowitz, M., Tedrake, R., and Sitzmann, V
Hyungjin Chung, Jeongsol Kim, Geon Yeong Park, Hyelin Nam, and Jong Chul Ye. Cfg++: Manifold- constrained classifier free guidance for diffusion models.arXiv preprint arXiv:2406.08070,
-
[4]
Classifier-free diffusion guidance
Jonathan Ho and Tim Salimans. Classifier-free diffusion guidance. InNeurIPS 2021 Workshop on Deep Generative Models and Downstream Applications,
work page 2021
-
[5]
Entropy rectifying guidance for diffusion and flow models
Tariq Berrada Ifriqi, Adriana Romero-Soriano, Michal Drozdzal, Jakob Verbeek, and Karteek Ala- hari. Entropy rectifying guidance for diffusion and flow models. InNeurIPS 2025-Thirty-ninth Conference on Neural Information Processing Systems,
work page 2025
-
[6]
Frame guidance: Training-free guidance for frame-level control in video diffusion models
Sangwon Jang, Taekyung Ki, Jaehyeong Jo, Jaehong Yoon, Soo Ye Kim, Zhe Lin, and Sung Ju Hwang. Frame guidance: Training-free guidance for frame-level control in video diffusion models. arXiv preprint arXiv:2506.07177,
-
[7]
Flow Matching for Generative Modeling
Yaron Lipman, Ricky TQ Chen, Heli Ben-Hamu, Maximilian Nickel, and Matt Le. Flow matching for generative modeling.arXiv preprint arXiv:2210.02747,
work page internal anchor Pith review Pith/arXiv arXiv
-
[8]
GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models
Alex Nichol, Prafulla Dhariwal, Aditya Ramesh, Pranav Shyam, Pamela Mishkin, Bob McGrew, Ilya Sutskever, and Mark Chen. Glide: Towards photorealistic image generation and editing with text-guided diffusion models.arXiv preprint arXiv:2112.10741,
work page internal anchor Pith review Pith/arXiv arXiv
-
[9]
Rectified-cfg++ for flow based models.arXiv preprint arXiv:2510.07631, 2025
Shreshth Saini, Shashank Gupta, and Alan C Bovik. Rectified-cfg++ for flow based models.arXiv preprint arXiv:2510.07631,
-
[10]
Wan: Open and Advanced Large-Scale Video Generative Models
Team Wan, Ang Wang, Baole Ai, Bin Wen, Chaojie Mao, Chen-Wei Xie, Di Chen, Feiwu Yu, Haiming Zhao, Jianxiao Yang, et al. Wan: Open and advanced large-scale video generative models. arXiv preprint arXiv:2503.20314,
work page internal anchor Pith review Pith/arXiv arXiv
-
[11]
Xiaoshi Wu, Yiming Hao, Keqiang Sun, Yixiong Chen, Feng Zhu, Rui Zhao, and Hongsheng Li. Human preference score v2: A solid benchmark for evaluating human preferences of text-to-image synthesis.arXiv preprint arXiv:2306.09341,
work page internal anchor Pith review Pith/arXiv arXiv
-
[12]
Candi Zheng and Yuan Lan. Characteristic guidance: Non-linear correction for diffusion model at large guidance scale.arXiv preprint arXiv:2312.07586,
-
[13]
admits a clean structural explanation. The flow parameterisation gives an exact identity for the divergence, and the spike emerges from a posterior-covariance gap of the clean data failing to vanish at a specific dimensional rate. 13 Proposition C.1(Late-stage divergence behaviour).Let xt =α tx1 +σ tx0 under the Lipman linear schedule with x0 ∼ N(0, I) an...
work page 2024
-
[14]
in which the score-parallel flux across iso-density surfaces governs off-manifold drift. Where APG operates on a heuristic decomposition, our framework derives the same construction from probability conservation. Second, our formulation is parameterisation-invariant.Working directly with the score st = ∇x logp t rather than ˆx0 or ε, the relevant projecti...
work page 2025
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.