CAB: Accelerating Flow and Diffusion Sampling via Rectification and Corrected Adams-Bashforth

Anuska Roy; Pravin Nair

arxiv: 2605.16736 · v2 · pith:E57JTNLKnew · submitted 2026-05-16 · 💻 cs.CV


Pith reviewed 2026-05-20 15:56 UTC · model grok-4.3


keywords: flow models, diffusion models, sampling acceleration, Adams-Bashforth, image synthesis, training-free sampler, rectified coordinates, low-step sampling


The pith

CAB accelerates sampling in flow and diffusion models by rectifying dynamics and applying a corrected Adams-Bashforth procedure without extra training or evaluations.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces CAB as a training-free sampler that speeds up image generation from pretrained flow and diffusion models. It first maps the sampling process into a shared rectified coordinate system and then uses a multistep Adams-Bashforth predictor with a correction term based on prior velocity values. This setup targets better quality at low numbers of function evaluations, particularly in the 6 to 20 step range, across class-conditional and large-scale text-to-image tasks. A sympathetic reader would care because fewer steps make these generative models more usable in practical settings where computation time or resources are limited.

Core claim

CAB transforms the sampling dynamics of both flow and diffusion models into a common rectified coordinate system and then applies a multistep Adams-Bashforth predictor augmented with a correction term derived from past velocity evaluations. This procedure incurs no extra function evaluations, maintains the same algorithmic form across model types, and achieves at least third-order local truncation error along with second-order global error. Experiments on pretrained models show improved quality versus NFE trade-offs in the low-step regime while staying competitive at higher step counts.
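To make the "no extra function evaluations" point concrete, here is a minimal sketch of a classical Adams-Bashforth loop on a generic ODE dy/dτ = v(y, τ). It is not the authors' CAB update: the source does not state the correction term, so only the uncorrected predictor is shown. The names `adams_bashforth`, `AB_WEIGHTS`, `velocity`, and `taus` are assumptions for illustration, and the weights are the textbook ones for a uniform grid. The point is that each step calls the velocity once and reuses a buffer of stored values, which is also the only data a past-velocity correction would need.

```python
from collections import deque

import numpy as np

# Classical Adams-Bashforth weights on a uniform grid, newest velocity first.
AB_WEIGHTS = {
    1: (1.0,),
    2: (3.0 / 2.0, -1.0 / 2.0),
    3: (23.0 / 12.0, -16.0 / 12.0, 5.0 / 12.0),
}


def adams_bashforth(velocity, y0, taus, order=3):
    """Integrate dy/dtau = velocity(y, tau) over the uniform grid `taus`.

    Exactly one velocity call per step; early steps use lower-order
    weights until the history buffer holds `order` past velocities.
    """
    y = np.asarray(y0, dtype=float)
    history = deque(maxlen=order)  # stored velocities, newest first
    nfe = 0
    for i in range(len(taus) - 1):
        h = taus[i + 1] - taus[i]
        history.appendleft(velocity(y, taus[i]))
        nfe += 1
        weights = AB_WEIGHTS[len(history)]
        y = y + h * sum(w * v for w, v in zip(weights, history))
    return y, nfe


# Toy problem (an assumption, not from the paper): dy/dtau = -y, y(0) = 1.
taus = np.linspace(0.0, 1.0, 11)
y_end, nfe = adams_bashforth(lambda y, tau: -y, 1.0, taus, order=3)
```

With 11 grid points the loop takes 10 steps and therefore spends 10 NFEs, regardless of the order chosen.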

What carries the argument

The rectified coordinate system paired with the corrected Adams-Bashforth procedure, which unifies acceleration across flow and diffusion models by enabling a single multistep solver to be applied uniformly.

If this is right

Improved sample quality at 6-20 NFEs on both class-conditional and large-scale text-to-image benchmarks.
Competitive performance against other training-free samplers when using higher step counts across most tested models.
Uniform algorithmic form that applies identically to flow and diffusion models without model-specific changes.
At least third-order local truncation error and second-order global error in the numerical integration.
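The pairing of "third-order local" with "second-order global" matches the standard relation between the two error notions for a stable method applied to a sufficiently smooth ODE. The symbols below (step size h, horizon T, step count N, order p, one-step result ŷ computed from exact data) are notation introduced here, not taken from the paper:

$$
\lVert y(\tau_{n+1}) - \hat{y}_{n+1} \rVert = O(h^{p+1})
\quad \Longrightarrow \quad
\max_{n \le N} \lVert y(\tau_n) - y_n \rVert = O(h^{p}),
\qquad N = T / h .
$$

Taking p = 2 gives an O(h³) local truncation error and an O(h²) global error: roughly N = T/h local errors accumulate, which costs one power of h.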

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

A single sampler implementation could serve multiple generative model families and thereby simplify deployment codebases.
Reduced step counts at maintained quality could support interactive or on-device image generation where latency matters.
The velocity-based correction approach might extend to accelerating other ODE-based processes outside image synthesis.

Load-bearing premise

Transforming the sampling dynamics to a common rectified coordinate system allows the same corrected Adams-Bashforth procedure to be applied uniformly to both flow and diffusion models without introducing model-specific degradation or requiring additional tuning.
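One standard way to straighten a diffusion sampling ODE, offered here as an assumption about what "rectified coordinates" could look like rather than as the paper's exact construction, is to rescale the state as y = x / s(t) and use λ = σ(t) / s(t) as the new time. For trajectories of the form x_t = s(t) x_0 + σ(t) ε, this gives dy/dλ = ε. The sketch below uses hypothetical schedule functions `s` and `sigma` and checks that a single Euler step in these coordinates is exact when the predicted noise is constant along the path.

```python
import numpy as np


def s(t):
    """Hypothetical signal scale (VP-like): cos(pi * t / 2)."""
    return np.cos(0.5 * np.pi * t)


def sigma(t):
    """Hypothetical noise scale: sin(pi * t / 2)."""
    return np.sin(0.5 * np.pi * t)


def to_rectified(x, t):
    """Map (x, t) to the rescaled state y = x / s(t) and time lam = sigma(t) / s(t)."""
    return x / s(t), sigma(t) / s(t)


def rectified_euler_step(x, t_from, t_to, eps_pred):
    """One Euler step of dy/dlam = eps in rescaled coordinates, mapped back to x."""
    y, lam_from = to_rectified(x, t_from)
    lam_to = sigma(t_to) / s(t_to)
    y_next = y + (lam_to - lam_from) * eps_pred
    return s(t_to) * y_next


# Toy trajectory with a constant noise direction (illustrative values only).
x0 = np.array([0.3, -1.2])
eps = np.array([0.7, 0.1])
t_from, t_to = 0.8, 0.2
x_from = s(t_from) * x0 + sigma(t_from) * eps
x_to = rectified_euler_step(x_from, t_from, t_to, eps)
x_exact = s(t_to) * x0 + sigma(t_to) * eps
```

In this idealized case `x_to` matches `x_exact` up to rounding. With a learned, state-dependent noise prediction the step is no longer exact, which is where a higher-order multistep rule would come in.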

What would settle it

Running CAB on a large-scale text-to-image model at exactly 10 NFEs and finding that its FID is no better (equal or higher) than that of a standard second-order solver such as Heun at the same NFE budget would falsify the claimed quality improvement in the low-step regime.
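A comparison at a fixed NFE budget has to account for the cost per step: Heun's method calls the model twice per step, while a multistep rule calls it once. The sketch below illustrates that bookkeeping on a toy scalar ODE, with plain AB2 standing in for CAB and absolute error standing in for FID. Both substitutions are assumptions made for illustration, so the numbers say nothing about image quality.

```python
import math


def heun(velocity, y0, t0, t1, nfe_budget):
    """Heun's method: two velocity calls per step, so nfe_budget // 2 steps."""
    n_steps = nfe_budget // 2
    h = (t1 - t0) / n_steps
    y, t, nfe = y0, t0, 0
    for _ in range(n_steps):
        k1 = velocity(y, t)
        k2 = velocity(y + h * k1, t + h)
        nfe += 2
        y += 0.5 * h * (k1 + k2)
        t += h
    return y, nfe


def ab2(velocity, y0, t0, t1, nfe_budget):
    """Plain AB2 with an Euler first step: one velocity call per step."""
    n_steps = nfe_budget
    h = (t1 - t0) / n_steps
    y, t, nfe, v_prev = y0, t0, 0, None
    for _ in range(n_steps):
        v = velocity(y, t)
        nfe += 1
        y += h * v if v_prev is None else h * (1.5 * v - 0.5 * v_prev)
        v_prev = v
        t += h
    return y, nfe


# Toy problem (an assumption): dy/dt = -y, y(0) = 1, exact y(1) = exp(-1).
budget = 10
exact = math.exp(-1.0)
heun_y, heun_nfe = heun(lambda y, t: -y, 1.0, 0.0, 1.0, budget)
ab2_y, ab2_nfe = ab2(lambda y, t: -y, 1.0, 0.0, 1.0, budget)
heun_err = abs(heun_y - exact)
ab2_err = abs(ab2_y - exact)
```

At 10 NFEs Heun takes 5 steps and AB2 takes 10. A real version of the test would replace the toy velocity with the pretrained model and the error with FID over a fixed set of prompts and seeds.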

Figures

Figures reproduced from arXiv: 2605.16736 by Anuska Roy, Pravin Nair.

**Figure 2.** FID versus NFE for CIFAR-10 with VP/VE schedules and ImageNet with the VP schedule.

**Figure 3.** Comparison of training-free samplers on 256 × 256 class-conditional ImageNet generation.

**Figure 4.** Comparison of training-free samplers on QWEN-Image

**Figure 6.** Ablation of the CAB correction weight γ on DiT/ImageNet 256 × 256. Stronger correction improves low-NFE FID, while moderate correction yields better NIQE

**Figure 7.** Empirical verification of the accuracy results in Theorem 3.2 on two representative nonlinear …

**Figure 8.** Comparison of AB2/AB3 and the proposed CAB-2/CAB-3 on two representative nonlinear …

**Figure 9.** Qualitative comparison of samplers in the 8-NFE regime for DiT. Increasing the solver order …

**Figure 10.** Unconditional generation on CIFAR-10 using EDM model.

**Figure 11.** Comparison of samplers on class-conditional ImageNet ( …

**Figure 12.** Comparison of training-free samplers on QWEN-Image

**Figure 13.** Prompt: “light wind, feathers moving, she moves her gaze, 4k.” Temporal comparison of training-free samplers on HunyuanVideo-1.5 over four frames from the same generated video. CAB-2 better preserves appearance consistency and smoother motion progression, while DPM++ and STORK show stronger temporal drift and frame-to-frame variation. Panel labels: Frame 1, Frame 2, Frame 3, Frame 4; DPM++, STORK, CAB-2.

**Figure 14.** Prompt: “A fluffy grey and white cat is lazily stretched out on a sunny window sill, enjoying a nap after a long day of lounging.” Temporal comparison of training-free samplers on HunyuanVideo-1.5 over four frames from the same generated video. CAB-2 preserves sharper fur texture, clearer facial structure, and more consistent window-side lighting, while DPM++ and STORK appear blurrier and show weaker appe…

**Figure 15.** Prompt: “a giraffe eating an apple.” Temporal comparison of training-free samplers on HunyuanVideo-1.5 over four frames from the same generated video. CAB-2 preserves cleaner appearance and more coherent scene context across frames, while DPM++ and STORK show weaker texture consistency and less stable background structure. Panel labels following this caption in the source: (a) N = 6, (b) N = 20, (c) N = 50.

**Figure 16.** Trajectories produced by AB2, AB3, CAB-2, and CAB-3, compared with the reference …

Original abstract

Flow and diffusion models achieve high-fidelity, high-resolution image synthesis, but often require many function evaluations (NFEs) at sampling time. Existing acceleration methods either require additional training through distillation or rely on training-free high-order solvers, and both can degrade sample quality at low NFE budgets. We propose CAB (Corrected Adams-Bashforth), a training-free sampler that accelerates both flow and diffusion models. CAB first transforms the sampling dynamics to a common rectified coordinate system, and then applies a multistep Adams-Bashforth predictor augmented with a simple correction term based on past velocity evaluations and therefore incurs no additional NFEs. The resulting method is simple, has the same algorithmic form across model classes, and has at least third-order local truncation error and second-order global error. Experiments on pretrained flow and diffusion models, including class-conditional and large-scale text-to-image benchmarks, show that CAB improves quality-NFE trade-offs in the low-step regime of 6-20 NFEs. It also remains competitive with strong training-free samplers at higher step counts across most tested models. The official implementation is available at https://github.com/Anuska-Roy/CAB.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

CAB gives a clean training-free way to boost low-NFE sampling quality for both flow and diffusion models via rectification plus a velocity correction on Adams-Bashforth, but the error-order claim needs explicit checks against diffusion schedules.


The main point is that CAB rectifies the sampling dynamics to a shared coordinate system and then runs a multistep Adams-Bashforth predictor with a simple past-velocity correction term. This keeps the same code path for flow and diffusion models and avoids any extra training or function evaluations. The experiments report better quality-NFE curves at 6-20 steps on class-conditional and text-to-image models, while staying competitive at higher step counts, and the code is public.

Referee Report

2 major / 2 minor

Summary. The manuscript introduces CAB, a training-free sampler for accelerating both flow and diffusion models. It first rectifies sampling trajectories into a common coordinate system, then applies a multistep Corrected Adams-Bashforth predictor that uses past velocity evaluations (no extra NFEs) and is claimed to achieve at least third-order local truncation error and second-order global error. Experiments on pretrained class-conditional and large-scale text-to-image models show improved quality-NFE trade-offs for 6-20 NFEs while remaining competitive at higher step counts.

Significance. If the rectification unifies the dynamics sufficiently for the claimed error order to hold uniformly and the empirical gains are robust to fair baseline matching, CAB would provide a simple, reproducible acceleration technique applicable to both model families without distillation or per-model tuning. The public implementation at https://github.com/Anuska-Roy/CAB is a positive factor for reproducibility.

major comments (2)

[Methods] Methods section (derivation of the correction term and local truncation error): the third-order claim assumes that rectification eliminates residual nonlinear drift terms arising from diffusion variance schedules. For linear or cosine schedules, an O(Δt²) remainder may persist after any affine rectification, which would invalidate the fixed correction coefficient and reduce the actual order; no explicit Taylor expansion or verification for standard schedules is supplied to confirm the assumption.
[Experiments] Experimental section (low-NFE regime, 6-20 NFEs): the reported quality gains rely on the rectification step preserving the smoothness and bounded-derivative conditions needed for the Adams-Bashforth analysis. Without controls that isolate the rectification effect (e.g., comparing rectified vs. non-rectified CAB on the same diffusion model), it is unclear whether the gains are attributable to the claimed order or to incidental trajectory straightening.

minor comments (2)

[Abstract] The abstract states 'at least third-order local truncation error'; the precise order and the exact form of the correction coefficient should be stated explicitly with the full expansion in the main text rather than left to the appendix.
[Methods] Notation for the rectified coordinate system and the velocity field after rectification should be introduced with a clear equation early in the Methods section to avoid ambiguity when the same CAB procedure is applied to both flow and diffusion models.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the thoughtful and constructive report. The comments help clarify the presentation of the theoretical claims and the attribution of empirical gains. We address each major comment below and indicate the revisions we will make to strengthen the manuscript.


Referee: [Methods] Methods section (derivation of the correction term and local truncation error): the third-order claim assumes that rectification eliminates residual nonlinear drift terms arising from diffusion variance schedules. For linear or cosine schedules, an O(Δt²) remainder may persist after any affine rectification, which would invalidate the fixed correction coefficient and reduce the actual order; no explicit Taylor expansion or verification for standard schedules is supplied to confirm the assumption.

Authors: We appreciate the referee drawing attention to the need for a more explicit error analysis. The rectification is constructed as an affine transformation chosen to align the integrated velocity fields of flow and diffusion models into a common coordinate system in which the leading nonlinear contributions from the variance schedule are removed or pushed to higher order. Nevertheless, we agree that the manuscript would benefit from a self-contained Taylor expansion of the rectified dynamics for the linear and cosine schedules used in our experiments. In the revised version we will insert this derivation, explicitly showing the order of the residual term after rectification and confirming that the fixed correction coefficient preserves the third-order local truncation error. We will also add a short numerical check of the observed convergence rate on a simple ODE with the same schedules. revision: yes
Referee: [Experiments] Experimental section (low-NFE regime, 6-20 NFEs): the reported quality gains rely on the rectification step preserving the smoothness and bounded-derivative conditions needed for the Adams-Bashforth analysis. Without controls that isolate the rectification effect (e.g., comparing rectified vs. non-rectified CAB on the same diffusion model), it is unclear whether the gains are attributable to the claimed order or to incidental trajectory straightening.

Authors: We concur that an explicit ablation isolating the rectification step would make the source of the low-NFE improvements clearer. Although CAB is presented as an integrated procedure in which rectification is a prerequisite for applying the corrected multistep rule, we will add a controlled comparison in the revised experimental section: on the same pretrained diffusion models we will report results for (i) the full CAB pipeline, (ii) the corrected Adams-Bashforth predictor applied directly in the original coordinates (i.e., without rectification), and (iii) a standard Adams-Bashforth baseline. These additional curves will allow readers to separate the contribution of rectification from the multistep correction itself. revision: yes
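The "short numerical check of the observed convergence rate" promised in the first response can be sketched as follows: run a solver with N and 2N steps and take the base-2 logarithm of the error ratio. This sketch uses plain AB2 (not CAB) with an exact starting value, on a toy nonlinear ODE with a known solution. The test problem and the helper names `ab2_final` and `observed_order` are assumptions for illustration, not the paper's experiment.

```python
import numpy as np


def ab2_final(f, exact, t_end, n_steps):
    """AB2 on [0, t_end] with exact start values y(0) and y(h); returns y(t_end)."""
    h = t_end / n_steps
    y = exact(h)
    f_prev = f(exact(0.0), 0.0)
    t = h
    for _ in range(n_steps - 1):
        f_now = f(y, t)
        y = y + h * (1.5 * f_now - 0.5 * f_prev)
        f_prev = f_now
        t += h
    return y


def observed_order(f, exact, t_end, n_steps):
    """Estimate the global order from errors at step sizes h and h / 2."""
    err_h = abs(ab2_final(f, exact, t_end, n_steps) - exact(t_end))
    err_half = abs(ab2_final(f, exact, t_end, 2 * n_steps) - exact(t_end))
    return float(np.log2(err_h / err_half))


# Toy nonlinear problem: dy/dt = -y^2, y(0) = 1, exact solution 1 / (1 + t).
order = observed_order(lambda y, t: -y * y, lambda t: 1.0 / (1.0 + t), 1.0, 64)
```

For AB2 the estimate should land near 2. Applying the same harness to a corrected scheme under the linear and cosine schedules would show whether the claimed global order survives the schedule-dependent terms the referee raises.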

Circularity Check

0 steps flagged

No significant circularity; derivation applies standard numerical methods after coordinate transformation


The paper presents CAB as the composition of a rectification step that maps flow and diffusion dynamics into a shared coordinate system followed by a corrected Adams-Bashforth multistep integrator whose local truncation error order is taken from the classical analysis of Adams-Bashforth schemes. No central quantity is defined in terms of itself, no parameter is fitted inside the paper and then relabeled as a prediction, and no load-bearing premise rests on a self-citation whose validity is presupposed. The algorithmic form and error claims are therefore independent of the paper's own experimental outputs or prior author-specific results.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the domain assumption that rectification produces a common dynamics amenable to the same multistep solver across model families, plus standard numerical analysis results for Adams-Bashforth methods.

axioms (1)

Domain assumption: sampling dynamics of flow and diffusion models can be transformed to a common rectified coordinate system without loss of the target distribution or introduction of model-specific artifacts.
This premise is required to justify applying the identical CAB procedure to both model classes.


Jan Verschelde. Variable step methods, 2022. Lecture notes for MCS 471, University of Illinois Chicago.

A Theoretical results and additional experiments

A.1 Proof of Lemma A.1

Starting from the reverse-time ODE in (3),

$$
\frac{dx_t}{dt} = \frac{\dot{s}_t}{s_t}\, x_t + \left( \dot{\sigma}_t - \frac{\dot{s}_t}{s_t}\, \sigma_t \right) \epsilon_\theta(x_t, t). \tag{10}
$$

Consider the rescaled state $y_t := x_t / s_t$, so that $x_t = s_t y_t$. Differentiating $x_t = s$ ...

work page arXiv 2022

[1] [1]

Denoising diffusion probabilistic models.Proc

Jonathan Ho, Ajay Jain, and Pieter Abbeel. Denoising diffusion probabilistic models.Proc. Advances in neural information processing systems, 33:6840–6851, 2020

work page 2020

[2] [2]

Yaron Lipman, Ricky T. Q. Chen, Heli Ben-Hamu, Maximilian Nickel, and Matthew Le. Flow matching for generative modeling.Proc. International Conference on Learning Representations, 2023

work page 2023

[3] [3]

Jonathan Ho, Tim Salimans, Alexey Gritsenko, William Chan, Mohammad Norouzi, and David J. Fleet. Video diffusion models.Proc. Advances in neural information processing systems, 35:8633–8646, 2022

work page 2022

[4] Yi Huang, Jiancheng Huang, Yifan Liu, Mingfu Yan, Jiaxi Lv, Jianzhuang Liu, Wei Xiong, He Zhang, Shifeng Chen, and Liangliang Cao. Diffusion model-based image editing: A survey. IEEE Transactions on Pattern Analysis and Machine Intelligence, 47(6):4409–4437, 2025.

[5] Chitwan Saharia, William Chan, Huiwen Chang, Chris A. Lee, Jonathan Ho, Tim Salimans, David J. Fleet, and Mohammad Norouzi. Palette: Image-to-image diffusion models. NeurIPS 2021 Workshop on Deep Generative Models and Downstream Applications, 2021.

[6] Andreas Lugmayr, Martin Danelljan, Andrés Romero, Fisher Yu, Radu Timofte, and Luc Van Gool. RePaint: Inpainting using denoising diffusion probabilistic models. Proc. IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022.

[7] Xin Li, Yulin Ren, Xin Jin, Cuiling Lan, Xingrui Wang, Wenjun Zeng, Xinchao Wang, and Zhibo Chen. Diffusion models for image restoration and enhancement: A comprehensive survey. International Journal of Computer Vision, 133(11):8078–8108, 2025.

[8] Yimu Wang, Xuye Liu, Wei Pang, Li Ma, Shuai Yuan, Paul Debevec, and Ning Yu. Survey of video diffusion models: Foundations, implementations, and applications. Transactions on Machine Learning Research, 2025. ISSN 2835-8856.

[9] Yang Song, Jascha Sohl-Dickstein, Diederik P. Kingma, Abhishek Kumar, Stefano Ermon, and Ben Poole. Score-based generative modeling through stochastic differential equations. Proc. International Conference on Learning Representations, 2021.

[10] Cheng Lu, Yuhao Zhou, Fan Bao, Jianfei Chen, Chongxuan Li, and Jun Zhu. DPM-Solver: A fast ODE solver for diffusion probabilistic model sampling in around 10 steps. Proc. Advances in Neural Information Processing Systems, 35:5775–5787, 2022.

[11] Jiaming Song, Chenlin Meng, and Stefano Ermon. Denoising diffusion implicit models. Proc. International Conference on Learning Representations, 2021.

[12] Wenliang Zhao, Lujia Bai, Yongming Rao, Jie Zhou, and Jiwen Lu. UniPC: A unified predictor-corrector framework for fast sampling of diffusion models. Proc. Advances in Neural Information Processing Systems, 36:49842–49869, 2023.

[13] Cheng Lu, Yuhao Zhou, Fan Bao, Jianfei Chen, Chongxuan Li, and Jun Zhu. DPM-Solver++: Fast solver for guided sampling of diffusion probabilistic models. Machine Intelligence Research, pages 1–22, 2025.

[14] Luping Liu, Yi Ren, Zhijie Lin, and Zhou Zhao. Pseudo numerical methods for diffusion models on manifolds. Proc. International Conference on Learning Representations, 2022.

[15] Qinsheng Zhang and Yongxin Chen. Fast sampling of diffusion models with exponential integrator. Proc. International Conference on Learning Representations, 2023.

[16] Kaiwen Zheng, Cheng Lu, Jianfei Chen, and Jun Zhu. DPM-Solver-v3: Improved diffusion ODE solver with empirical model statistics. Proc. Advances in Neural Information Processing Systems, 36:55502–55542, 2023.

[17] Zheng Tan, Weizhen Wang, Andrea L. Bertozzi, and Ernest K. Ryu. STORK: Faster diffusion and flow matching sampling by resolving both stiffness and structure-dependence. Proc. International Conference on Learning Representations, 2026.

[18] Chenfei Wu, Jiahao Li, Jingren Zhou, Junyang Lin, Kaiyuan Gao, Kun Yan, Sheng-ming Yin, Shuai Bai, Xiao Xu, Yilei Chen, et al. Qwen-Image technical report. arXiv preprint arXiv:2508.02324, 2025.

[19] William Peebles and Saining Xie. Scalable diffusion models with transformers. Proc. IEEE/CVF International Conference on Computer Vision, pages 4195–4205, 2023.

[20] Ricky T. Q. Chen, Yulia Rubanova, Jesse Bettencourt, and David K. Duvenaud. Neural ordinary differential equations. Proc. Advances in Neural Information Processing Systems, 31, 2018.

[21] Will Grathwohl, Ricky T. Q. Chen, Jesse Bettencourt, and David Duvenaud. FFJORD: Free-form continuous dynamics for scalable reversible generative models. Proc. International Conference on Learning Representations, 2019.

[22] Tero Karras, Miika Aittala, Timo Aila, and Samuli Laine. Elucidating the design space of diffusion-based generative models. Proc. Advances in Neural Information Processing Systems, 35:26565–26577, 2022.

[23] Xingchao Liu, Chengyue Gong, and Qiang Liu. Flow straight and fast: Learning to generate and transfer data with rectified flow. Proc. International Conference on Learning Representations, 2023.

[24] Diederik Kingma, Tim Salimans, Ben Poole, and Jonathan Ho. Variational diffusion models. Proc. Advances in Neural Information Processing Systems, 34:21696–21707, 2021.

[25] Tim Salimans and Jonathan Ho. Progressive distillation for fast sampling of diffusion models. Proc. International Conference on Learning Representations, 2022.

[26] Nikita Starodubcev, Ilya Drobyshevskiy, Denis Kuznedelev, Artem Babenko, and Dmitry Baranchuk. Scale-wise distillation of diffusion models. Proc. International Conference on Learning Representations, 2026.

[27] Yang Song, Prafulla Dhariwal, Mark Chen, and Ilya Sutskever. Consistency models. Proc. International Conference on Machine Learning, pages 32211–32252, 2023.

[28] Cheng Lu and Yang Song. Simplifying, stabilizing and scaling continuous-time consistency models. Proc. International Conference on Learning Representations, 2025.

[29] Xingchao Liu, Xiwen Zhang, Jianzhu Ma, Jian Peng, and Qiang Liu. InstaFlow: One step is enough for high-quality diffusion-based text-to-image generation. Proc. International Conference on Learning Representations, 2024.

[30] Tianwei Yin, Michaël Gharbi, Richard Zhang, Eli Shechtman, Frédo Durand, William T. Freeman, and Taesung Park. One-step diffusion with distribution matching distillation. Proc. IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 6613–6623, 2024.

[31] Junsong Chen, Shuchen Xue, Yuyang Zhao, Jincheng Yu, Sayak Paul, Junyu Chen, Han Cai, Song Han, and Enze Xie. SANA-Sprint: One-step diffusion with continuous-time consistency distillation. Proc. IEEE/CVF International Conference on Computer Vision, pages 16185–16195, 2025.

[32] Shuchen Xue, Mingyang Yi, Weijian Luo, Shifeng Zhang, Jiacheng Sun, Zhenguo Li, and Zhi-Ming Ma. SA-Solver: Stochastic Adams solver for fast sampling of diffusion models. Proc. Advances in Neural Information Processing Systems, 36:77632–77674, 2023.

[33] Yilun Xu, Mingyang Deng, Xiang Cheng, Yonglong Tian, Ziming Liu, and Tommi Jaakkola. Restart sampling for improving generative processes. Proc. Advances in Neural Information Processing Systems, 36:76806–76838, 2023.

[34] Neta Shaul, Juan Perez, Ricky T. Q. Chen, Ali Thabet, Albert Pumarola, and Yaron Lipman. Bespoke solvers for generative flow models. Proc. International Conference on Learning Representations, 2024.

[35] Eric Frankel, Sitan Chen, Jerry Li, Pang Wei Koh, Lillian J. Ratliff, and Sewoong Oh. S4S: Solving for a fast diffusion model solver. Proc. International Conference on Machine Learning, 2025.

[36] Ernst Hairer, Syvert P. Nørsett, and Gerhard Wanner. Solving Ordinary Differential Equations I: Nonstiff Problems. Springer, 2nd edition, 1993.

[37] J. C. Butcher. Numerical methods for ordinary differential equations in the 20th century. Journal of Computational and Applied Mathematics, 125(1–2):1–29, 2000.

[38] Uri M. Ascher and Linda R. Petzold. Computer Methods for Ordinary Differential Equations and Differential-Algebraic Equations. SIAM, 1998.

[39] C. Böhm and H. J. Stetter. The defect correction approach. Computing, 32(1):3–22, 1984.

[40] Benjamin W. Ong and Raymond J. Spiteri. Deferred correction methods for ordinary differential equations. International Journal of Computer Mathematics, 97(1–2):378–398, 2020.

[41] Jiazheng Xu, Xiao Liu, Yuchen Wu, Yuxuan Tong, Qinkai Li, Ming Ding, Jie Tang, and Yuxiao Dong. ImageReward: Learning and evaluating human preferences for text-to-image generation. Proc. Advances in Neural Information Processing Systems, 36:15903–15935, 2023.

[42] Weijie Kong, Qi Tian, Zijian Zhang, Rox Min, Zuozhuo Dai, Jin Zhou, Jiangfeng Xiong, Xin Li, Bo Wu, Jianwei Zhang, et al. HunyuanVideo: A systematic framework for large video generative models. arXiv preprint arXiv:2412.03603, 2024.

[43] Yaofang Liu, Xiaodong Cun, Xuebo Liu, Xintao Wang, Yong Zhang, Haoxin Chen, Yang Liu, Tieyong Zeng, Raymond Chan, and Ying Shan. EvalCrafter: Benchmarking and evaluating large video generation models. Proc. IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 22139–22149, 2024.

[44] Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, et al. Learning transferable visual models from natural language supervision. Proc. International Conference on Machine Learning, pages 8748–8763, 2021.

[45] Junnan Li, Dongxu Li, Caiming Xiong, and Steven Hoi. BLIP: Bootstrapping language-image pre-training for unified vision-language understanding and generation. Proc. International Conference on Machine Learning, pages 12888–12900, 2022.

[46] Jan Verschelde. Variable step methods, 2022. Lecture notes for MCS 471, University of Illinois Chicago.

A Theoretical results and additional experiments

A.1 Proof of Lemma A.1

Starting from the reverse-time ODE in (3),

$$\frac{dx_t}{dt} = \frac{\dot{s}_t}{s_t}\, x_t + \left(\dot{\sigma}_t - \frac{\dot{s}_t}{s_t}\,\sigma_t\right)\epsilon_\theta(x_t, t). \tag{10}$$

Consider the rescaled state $y_t := x_t / s_t$, so that $x_t = s_t y_t$. Differentiating $x_t = s$ ...
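The proof text is cut off at the differentiation step. As a hedged sketch of where the substitution leads, assuming the argument applies the product rule to $x_t = s_t y_t$ and substitutes into (10) (the rest of the proof is not visible here, so the paper's own presentation may differ), with $s_t \neq 0$ and differentiable $s_t$, $\sigma_t$:

```latex
\begin{align*}
\dot{s}_t\, y_t + s_t\, \dot{y}_t
  &= \frac{\dot{s}_t}{s_t}\,(s_t y_t)
   + \Big(\dot{\sigma}_t - \frac{\dot{s}_t}{s_t}\,\sigma_t\Big)\,
     \epsilon_\theta(s_t y_t, t) \\
\Longrightarrow\quad \dot{y}_t
  &= \frac{\dot{\sigma}_t\, s_t - \dot{s}_t\, \sigma_t}{s_t^{2}}\;
     \epsilon_\theta(s_t y_t, t)
   = \frac{d}{dt}\!\Big(\frac{\sigma_t}{s_t}\Big)\,
     \epsilon_\theta(s_t y_t, t).
\end{align*}
```

The term linear in $x_t$ cancels on both sides, so in the rescaled variable the state enters the right-hand side only through the network output. Whether Lemma A.1 is stated in exactly this form cannot be confirmed from the truncated text.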