Temporal Score Rescaling for Temperature Sampling in Diffusion and Flow Models
read the original abstract
We present a mechanism to steer the sampling diversity of denoising diffusion and flow matching models, allowing users to sample from a sharper or broader distribution than the training distribution. We build on the observation that these models leverage (learned) score functions of noisy data distributions for sampling and show that rescaling these allows one to effectively control a 'local' sampling temperature. Notably, this approach does not require any finetuning or alterations to training strategy, and can be applied to any off-the-shelf model and is compatible with both deterministic and stochastic samplers. We first validate our framework on toy 2D data, and then demonstrate its application for diffusion models trained across five disparate tasks -- image generation, pose estimation, depth prediction, robot manipulation, and protein design. We find that across these tasks, our approach allows sampling from sharper (or flatter) distributions, yielding performance gains e.g., depth prediction models benefit from sampling more likely depth estimates, whereas image generation models perform better when sampling a slightly flatter distribution.
This paper has not been read by Pith yet.
Forward citations
Cited by 2 Pith papers
-
Optimizing Diffusion Priors in Image Reconstruction from a Single Observation
Combining diffusion priors as a product-of-experts and optimizing exponents via Bayesian evidence maximization enables prior tuning from one observation in inverse imaging problems.
-
Continuous Adversarial Flow Models
Continuous adversarial flow models replace MSE in flow matching with adversarial training via a discriminator, improving guidance-free FID on ImageNet from 8.26 to 3.63 for SiT and similar gains for JiT and text-to-im...
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.