Shortcut models enable high-quality single-step or few-step sampling in diffusion models with a single network and a single training phase by conditioning on the desired step size.
4 Pith papers cite this work.
citing papers explorer
- One Step Diffusion via Shortcut Models
Shortcut models enable high-quality single-step or few-step sampling in diffusion models with a single network and a single training phase by conditioning on the desired step size.
- Variance Reduction for Expectations with Diffusion Teachers
CARV amortizes upstream diffusion teacher costs over noise resamples with timestep importance sampling and stratified-inverse-CDF sampling, delivering 2-3x effective compute gains in text-to-3D experiments and order-of-magnitude variance cuts in single-step distillation.
- MixFlow: Mixed Source Distributions Improve Rectified Flows
Mixing unconditional Gaussian noise with a κ-conditioned source during training of rectified flows reduces path curvature, yielding 12% better FID scores and faster sampling than standard rectified flows.
- DOLLAR: Few-Step Video Generation via Distillation and Latent Reward Optimization
DOLLAR combines variational score distillation and consistency distillation for few-step video generation, adds latent reward optimization, and reports a VBench score of 82.57 and up to a 278x speedup over the teacher diffusion model for 128-frame, 10-second videos.