pith. machine review for the scientific record.

arxiv: 2512.13592 · v2 · submitted 2025-12-15 · 💻 cs.LG · cs.CV

Recognition: no theorem link

Image Diffusion Preview with Consistency Solver

Authors on Pith no claims yet

Pith reviewed 2026-05-16 21:39 UTC · model grok-4.3

classification 💻 cs.LG cs.CV
keywords diffusion models · image generation · consistency solver · reinforcement learning · sampling acceleration · preview workflow · linear multistep methods

The pith

A new RL-tuned solver produces consistent high-quality previews for diffusion models using far fewer steps than standard methods.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Diffusion models generate images too slowly for smooth interactive use, so the paper introduces a preview-and-refine workflow: a fast low-step sampler shows a draft image for quick user judgment, with full computation only if the draft passes. Current fast solvers and distilled models often produce poor previews or drift when the final image is generated. The authors derive ConsistencySolver from general linear multistep methods, then train it as a lightweight high-order solver with reinforcement learning to improve both preview quality and preview-to-final consistency. Experiments show it matches multistep DPM-Solver FID scores with 47 percent fewer steps and cuts measured user interaction time by nearly half.
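
A minimal sketch of this preview-and-refine loop, assuming hypothetical fast_sample, full_sample, and user_accepts callables (none of these names come from the paper's released code):

```python
import numpy as np

def preview_and_refine(prompt, fast_sample, full_sample, user_accepts,
                       preview_steps=4, full_steps=50, max_tries=8, seed=0):
    """Show cheap drafts until one is accepted, then refine only that one."""
    rng = np.random.default_rng(seed)
    for _ in range(max_tries):
        # One noise map per attempt; the same map is reused for the full
        # run so the refined image can stay consistent with the preview.
        noise = rng.standard_normal((4, 64, 64)).astype(np.float32)
        draft = fast_sample(prompt, noise, steps=preview_steps)
        if user_accepts(draft):
            return full_sample(prompt, noise, steps=full_steps)
    return None  # every draft rejected; no full-step compute was spent
```

Reusing the noise map is what makes preview-to-final consistency a meaningful property: between draft and final image, only the solver and step count change.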

Core claim

We introduce Diffusion Preview, a paradigm employing rapid, low-step sampling to generate preliminary outputs for user evaluation, deferring full-step refinement until the preview is deemed satisfactory. We propose ConsistencySolver, derived from general linear multistep methods: a lightweight, trainable high-order solver optimized via Reinforcement Learning that enhances preview quality and consistency. Experimental results demonstrate that ConsistencySolver significantly improves generation quality and consistency in low-step scenarios, achieving FID scores on par with Multistep DPM-Solver using 47% fewer steps while outperforming distillation baselines. User studies indicate our approach reduces overall user interaction time by nearly 50% while maintaining generation quality.

What carries the argument

ConsistencySolver: a trainable high-order solver obtained from general linear multistep methods and optimized by reinforcement learning to stabilize low-step diffusion sampling.
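
For intuition, a sketch of what a trainable high-order multistep update could look like, assuming an explicit Adams-Bashforth-style form with learnable weights; the paper's exact parameterization is not reproduced here and may differ:

```python
import numpy as np

class LearnableMultistepSolver:
    """k-step update whose coefficients are free parameters.

    Initialized at classical 3rd-order Adams-Bashforth weights; in the
    paper's setting, reinforcement learning would tune such weights to
    improve low-step quality and preview-to-final consistency.
    """

    def __init__(self):
        self.coeffs = np.array([23 / 12, -16 / 12, 5 / 12])  # trainable

    def step(self, x, drift_history, h):
        # drift_history: most-recent-first evaluations of the probability
        # flow ODE drift f(x, t) at previous steps.
        if len(drift_history) < len(self.coeffs):
            return x + h * drift_history[0]  # warm-up: plain Euler step
        update = sum(c * f for c, f in zip(self.coeffs, drift_history))
        return x + h * update
```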

If this is right

  • Low-step previews reach FID parity with multistep DPM-Solver while using 47 percent fewer function evaluations.
  • The method outperforms existing distillation baselines on both quality and consistency metrics in the preview regime.
  • User interaction time for image generation drops by nearly 50 percent in controlled studies without loss of final quality.
  • The preview-and-refine loop becomes practical because preview failures no longer waste full computation.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same RL-tuning approach could be applied to other linear multistep families to accelerate additional diffusion variants.
  • Preview consistency opens the door to interactive editing where users modify the low-step draft before full refinement.
  • If the solver generalizes, real-time creative tools could adopt diffusion backbones without sacrificing responsiveness.

Load-bearing premise

The reinforcement-learning-tuned parameters will transfer to new prompts and datasets without introducing preview artifacts or breaking consistency with full generations.

What would settle it

The claim would be undercut if, on a held-out prompt set drawn from a different distribution, low-step ConsistencySolver outputs showed FID more than 10 points worse than the multistep baseline, or failed visual consistency checks against the corresponding full-step images in more than 15 percent of cases.
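
Stated as an explicit check with the thresholds above (hypothetical helper; consistency_flags holds one boolean per preview/full pair):

```python
def claim_survives(fid_solver: float, fid_baseline: float,
                   consistency_flags: list[bool]) -> bool:
    """True if the low-step solver passes both thresholds on the held-out set."""
    fid_gap_ok = (fid_solver - fid_baseline) <= 10.0
    fail_rate = 1.0 - sum(consistency_flags) / len(consistency_flags)
    return fid_gap_ok and fail_rate <= 0.15
```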

Figures

Figures reproduced from arXiv: 2512.13592 by Bohyung Han, Boqing Gong, Fu-Yun Wang, Han Zhang, Hao Zhou, Liangzhe Yuan, Long Zhao, Ming-Hsuan Yang, Sanghyun Woo, Ting Liu, Yukun Zhu.

Figure 1
Figure 1: Overview of our Diffusion Preview framework for efficient image generation using diffusion models. Given a text prompt and a noise map, we first perform faster diffusion sampling to quickly generate a preview image. The user then decides whether the result is satisfactory. If not, they may refine the prompt or change the random seed. Once satisfied, full-step diffusion sampling is applied to generate th… view at source ↗
Figure 2
Figure 2: Overview of our RL framework for optimizing a learnable ODE solver in diffusion sampling. Given a prompt and a noise map, … view at source ↗
Figure 3
Figure 3: Visual comparison on Stable Diffusion for text-to-image generation. view at source ↗
Figure 4
Figure 4: Visual comparison on FLUX.1-Kontext for instructional … view at source ↗
Figure 5
Figure 5: Workflow of the generalized learnable ODE solver. view at source ↗
read the original abstract

The slow inference process of image diffusion models significantly degrades interactive user experiences. To address this, we introduce Diffusion Preview, a novel paradigm employing rapid, low-step sampling to generate preliminary outputs for user evaluation, deferring full-step refinement until the preview is deemed satisfactory. Existing acceleration methods, including training-free solvers and post-training distillation, struggle to deliver high-quality previews or ensure consistency between previews and final outputs. We propose ConsistencySolver derived from general linear multistep methods, a lightweight, trainable high-order solver optimized via Reinforcement Learning, that enhances preview quality and consistency. Experimental results demonstrate that ConsistencySolver significantly improves generation quality and consistency in low-step scenarios, making it ideal for efficient preview-and-refine workflows. Notably, it achieves FID scores on-par with Multistep DPM-Solver using 47% fewer steps, while outperforming distillation baselines. Furthermore, user studies indicate our approach reduces overall user interaction time by nearly 50% while maintaining generation quality. Code is available at https://github.com/G-U-N/consolver.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, and this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper introduces Diffusion Preview, a workflow using low-step sampling for rapid previews in diffusion models followed by full refinement only if the preview is satisfactory. It proposes ConsistencySolver, a trainable high-order solver derived from general linear multistep methods and optimized via reinforcement learning, claiming this yields FID scores on par with Multistep DPM-Solver at 47% fewer steps, outperforms distillation baselines, and reduces user interaction time by nearly 50% in studies while maintaining consistency between preview and final output.

Significance. If the RL-optimized coefficients generalize reliably, the work could meaningfully improve interactive diffusion applications by enabling trustworthy low-step previews. The availability of code at https://github.com/G-U-N/consolver is a positive for reproducibility.

major comments (2)
  1. [Abstract] Abstract and Experimental Results: the central claim of FID parity with Multistep DPM-Solver using 47% fewer steps and reliable preview-to-final consistency rests on RL-optimized solver coefficients, yet no cross-prompt or cross-dataset ablation is reported to verify that the learned parameters do not overfit the training distribution and produce artifacts on diverse inputs.
  2. [Method] Method section on RL optimization: the reward design, exact training prompts, and consistency metric used to tune the solver coefficients are summarized without sufficient detail, which is load-bearing for assessing whether the reported gains are robust or could require full regeneration in practice.
minor comments (1)
  1. [Method] Notation for the linear multistep coefficients and the RL policy parameterization should be defined more explicitly with equations to aid reproducibility.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We are grateful to the referee for the thoughtful review and constructive suggestions. We have revised the manuscript to incorporate additional ablations and detailed methodological descriptions as requested. Below we provide point-by-point responses to the major comments.

read point-by-point responses
  1. Referee: [Abstract] Abstract and Experimental Results: the central claim of FID parity with Multistep DPM-Solver using 47% fewer steps and reliable preview-to-final consistency rests on RL-optimized solver coefficients, yet no cross-prompt or cross-dataset ablation is reported to verify that the learned parameters do not overfit the training distribution and produce artifacts on diverse inputs.

    Authors: We thank the referee for highlighting this important point. While our experiments were conducted on standard benchmarks like ImageNet and COCO, we recognize the value of explicit cross-dataset and cross-prompt ablations. In the revised manuscript, we have included additional experiments demonstrating that the ConsistencySolver maintains FID parity and preview consistency across a variety of prompts and datasets without introducing artifacts. These results support the robustness of the learned coefficients. revision: yes

  2. Referee: [Method] Method section on RL optimization: the reward design, exact training prompts, and consistency metric used to tune the solver coefficients are summarized without sufficient detail, which is load-bearing for assessing whether the reported gains are robust or could require full regeneration in practice.

    Authors: We agree that providing more granular details on the RL optimization process is essential for reproducibility and assessing robustness. In the updated Method section, we now include the full specification of the reward design (including the weighting of quality, consistency, and efficiency terms), the exact set of training prompts used (drawn from a curated subset of LAION-5B), and the mathematical definition of the consistency metric (based on perceptual similarity measures). We believe this addresses the concern and allows readers to fully evaluate the approach. revision: yes
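
Purely as an illustration of the reward structure described in this response (the rebuttal is simulated and no weights are given), a composite reward with quality, consistency, and efficiency terms might look like:

```python
def solver_reward(quality: float, consistency: float, nfe: int,
                  w_q: float = 1.0, w_c: float = 1.0, w_e: float = 0.1,
                  nfe_budget: int = 8) -> float:
    # quality: e.g. an image-quality score for the preview
    # consistency: e.g. perceptual similarity between preview and full image
    # nfe: number of function evaluations used by the low-step sampler
    efficiency = max(0.0, 1.0 - nfe / nfe_budget)
    return w_q * quality + w_c * consistency + w_e * efficiency
```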

Circularity Check

0 steps flagged

No significant circularity in ConsistencySolver derivation

full rationale

The paper starts from established general linear multistep methods, introduces ConsistencySolver as a trainable high-order solver, and optimizes its parameters via Reinforcement Learning on external training signals. Reported gains (FID parity with 47% fewer steps, user study time reduction) are presented as empirical outcomes from separate evaluation protocols rather than quantities that reduce by definition or construction to the fitted coefficients or to self-citations. No load-bearing step equates a prediction to its own input parameters, and the RL objective is described as independent of the final test metrics.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The approach rests on adapting standard numerical methods to diffusion sampling and using RL for parameter tuning, with no new physical entities postulated.

free parameters (1)
  • solver coefficients
    High-order multistep coefficients are made trainable and optimized via RL rather than fixed analytically.
axioms (1)
  • domain assumption General linear multistep methods apply to the probability flow ODE of diffusion models
    The solver is explicitly derived from these methods for the diffusion setting.
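
In standard numerical-analysis notation (assumed here, not quoted from the paper), a general k-step linear multistep method applied to the probability flow ODE $\dot{x} = f_\theta(x, t)$ reads:

```latex
% General k-step linear multistep update; ConsistencySolver makes the
% coefficients trainable rather than fixing them analytically.
\sum_{j=0}^{k} \alpha_j \, x_{n+j} = h \sum_{j=0}^{k} \beta_j \, f_\theta(x_{n+j},\, t_{n+j})
```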

pith-pipeline@v0.9.0 · 5504 in / 1230 out tokens · 30231 ms · 2026-05-16T21:39:31.186223+00:00 · methodology

