arxiv: 2605.03712 · v1 · submitted 2026-05-05 · 📊 stat.ML · cs.LG

Recognition: unknown

Tempered Guided Diffusion

Andreas Makris , Paul Fearnhead , Chris Nemeth

Authors on Pith no claims yet

Pith reviewed 2026-05-07 12:56 UTC · model grok-4.3

classification 📊 stat.ML cs.LG

keywords tempered guided diffusionsequential Monte Carlodiffusion modelsconditional samplingtraining-freeparticle approximationinverse problemsposterior consistency

0 comments

The pith

Tempered Guided Diffusion uses annealed SMC to produce consistent particle approximations to posteriors from diffusion priors.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Existing training-free conditional diffusion samplers often waste effort on poor early decisions that later steps cannot fix. Tempered Guided Diffusion reframes the task as annealed sequential Monte Carlo targeting tempered posteriors over the clean signal, treating noisy diffusion states only as auxiliary variables for proposing reconstructions. Particles are reweighted by incremental likelihood ratios, resampled, and propagated across noise levels so that computation concentrates on trajectories plausible under both prior and observation. Under idealized exact-reconstruction assumptions the particle system converges to the true posterior as the number of particles grows. An accelerated variant prunes to one trajectory after initial exploration to improve wall-clock efficiency on costly tasks.

Core claim

TGD is an annealed sequential Monte Carlo framework that targets tempered posterior distributions over the clean signal by using noisy diffusion states as auxiliary variables. Reconstructions are proposed at each noise level, reweighted by incremental likelihood ratios, resampled, and propagated to the next level. Under idealized exact-reconstruction assumptions the resulting particle approximation is consistent for the posterior as the number of particles tends to infinity. The accelerated version (A-TGD) retains early particle exploration but prunes to a single high-likelihood trajectory partway through sampling to reduce cost.

What carries the argument

Annealed sequential Monte Carlo with tempering, in which noisy diffusion states serve as auxiliary variables for proposing reconstructions that are then reweighted and propagated by incremental likelihood ratios across noise levels.

If this is right

The particle approximation converges to the true posterior under the idealized assumptions as the number of particles grows.
A-TGD retains early exploration benefits while reducing later computation to a single path for expensive reconstructions.
Experiments demonstrate improved posterior approximation quality and better speed-quality tradeoffs than independent multi-trajectory guided diffusion on both 2D and image inverse problems.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same annealed-SMC structure could be transferred to other generative priors that admit a natural noise schedule.
Adaptive choice of tempering levels based on effective sample size might further improve robustness when the observation model is misspecified.
Viewing diffusion sampling as particle propagation suggests natural extensions for handling multimodal conditionals by maintaining diversity longer.

Load-bearing premise

Exact reconstruction from noisy states is possible at every step and the diffusion prior remains appropriate when the target distribution is tempered.

What would settle it

In the controlled two-dimensional inverse problem where the true posterior is known exactly, the total variation distance or moment error between the TGD particle approximation and the true posterior should decrease toward zero as the number of particles increases.

Figures

Figures reproduced from arXiv: 2605.03712 by Andreas Makris, Chris Nemeth, Paul Fearnhead.

**Figure 1.** Figure 1: Motivation for A-TGD. Independent trajectories select only after completion; A-TGD explores early, prunes by likelihood, and completes one trajectory. dates, even after some have become unlikely to satisfy the observation. This suggests that the main difficulty is not only the amount of computation, but how it is allocated across candidate trajectories. This perspective is consistent with a broader theme i… view at source ↗

**Figure 2.** Figure 2: Controlled 2D inverse problem. Left: prior context and posterior zoom for a representative test condition. Orange crosses denote the final TGD particles with N = 64; black markers denote the ground-truth clean sample and the sign-ambiguous candidates induced by the absolute-value observation. Right: SWD as a function of particle count. Bands denote ±1 standard error over 10 test conditions. TGD improves ra… view at source ↗

**Figure 3.** Figure 3: Speed-quality tradeoff on FFHQ phase retrieval. (a) LPIPS as a function of wall-clock time per image for DPS, DAPS (N = 1), DAPS (N = 4), and A-TGD. (b) Speedup of A-TGD over DAPS (N = 4) to reach fixed LPIPS thresholds. Observation Ground Truth A-TGD (ours) DAPS (N=1) DAPS (N=4) DPS LPIPS: 0.094 view at source ↗

**Figure 4.** Figure 4: Qualitative reconstructions on FFHQ. We show one inpainting example and one phase retrieval example. A-TGD starts with four particles and prunes to one trajectory, while DAPS (N = 4) runs four independent trajectories and selects the final reconstruction with lowest measurement error. remains poor even as the number of function evaluations increases, while DAPS generally improves with runtime, especially f… view at source ↗

**Figure 5.** Figure 5: Settings under which existing training-free conditional samplers are recovered as special view at source ↗

**Figure 6.** Figure 6: Likelihood-tempering schedules for the EDM discretization with 256 steps. The uniform view at source ↗

**Figure 7.** Figure 7: FFHQ phase retrieval examples. 30 view at source ↗

**Figure 8.** Figure 8: FFHQ inpainting examples. Observation Ground Truth A-TGD (ours) DAPS (N=1) DAPS (N=4) DPS LPIPS: 0.099 view at source ↗

**Figure 9.** Figure 9: ImageNet phase retrieval examples. 31 view at source ↗

**Figure 10.** Figure 10: ImageNet inpainting examples. 32 view at source ↗

read the original abstract

Training-free conditional diffusion provides a flexible alternative to task-specific conditional model training, but existing samplers often allocate computation inefficiently: independent guided trajectories can vary widely in quality, and additional function evaluations along a single trajectory may not recover from poor early decisions. We propose Tempered Guided Diffusion (TGD), an annealed sequential Monte Carlo framework for training-free conditional sampling with diffusion priors. TGD targets tempered posterior distributions over the clean signal, using noisy diffusion states only as auxiliary variables for proposing reconstructions and propagating particles. Particles are reweighted by incremental likelihood ratios, resampled, and propagated across noise levels, concentrating computation on trajectories plausible under both the prior and observation. Under idealized exact-reconstruction assumptions, full TGD yields a consistent particle approximation to the posterior as the number of particles grows. For expensive reconstruction tasks, Accelerated TGD (A-TGD) retains early particle exploration but prunes to a single high-likelihood trajectory partway through sampling. Experiments on a controlled two-dimensional inverse problem and image inverse problems show improved posterior approximation and favorable wall-clock speed-quality tradeoffs over independent multi-trajectory baselines.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

TGD applies annealed SMC to diffusion guidance by tempering posteriors over clean signals with auxiliary noisy states, yielding consistency under exact-reconstruction assumptions and better empirical tradeoffs than independent trajectories.

read the letter

The one or two things to know: This paper introduces Tempered Guided Diffusion, an annealed sequential Monte Carlo method for training-free conditional sampling with diffusion models. It targets tempered posteriors over the clean signal using noisy states as auxiliaries, with reweighting by incremental likelihood ratios, and includes an accelerated variant that prunes to one trajectory after initial exploration. The consistency result for the particle approximation as the number of particles grows, under exact-reconstruction assumptions, is a clear theoretical step forward. The experiments on a 2D inverse problem and image tasks indicate improved posterior approximation and better wall-clock tradeoffs compared to multi-trajectory baselines. What is new is this specific combination of tempering the posterior and treating the diffusion process as auxiliary variables for particle propagation and resampling. It doesn't collapse to existing independent-trajectory guidance or standard methods. The paper does well by grounding the method in SMC principles applied to this setting and by offering both the full TGD and the faster A-TGD option for practical use in expensive reconstruction tasks. The soft spots are around the assumptions and verification. The consistency holds only under idealized exact-reconstruction assumptions, which could be a limitation when applying it to real diffusion models where reconstruction isn't perfect. There's no detailed error analysis or convergence rates beyond the basic consistency, and the tempering of the prior needs to be checked for how it preserves the intended distribution in practice. The abstract mentions favorable results but without code or full details, reproducibility is an open question for now. This work is for people in the generative modeling community focused on conditional generation and inverse problems without retraining models. Readers who follow developments in guidance techniques or SMC in machine learning will get the most value. It deserves a serious referee because the framework is novel, the claims are conditioned properly, and the experiments support the ideas, even if more work on assumptions and implementation details would strengthen it. I recommend sending this to peer review rather than desk rejecting it.

Referee Report

1 major / 3 minor

Summary. The paper proposes Tempered Guided Diffusion (TGD), an annealed sequential Monte Carlo framework for training-free conditional sampling from diffusion priors. It targets tempered posteriors over the clean signal, treating noisy diffusion states as auxiliary variables for proposing reconstructions and propagating particles via reweighting by incremental likelihood ratios, resampling, and propagation across noise levels. Under idealized exact-reconstruction assumptions, full TGD is claimed to produce a consistent particle approximation to the posterior as the number of particles grows. An accelerated variant (A-TGD) prunes to a single high-likelihood trajectory for efficiency. Experiments on a controlled 2D inverse problem and image inverse problems report improved posterior approximation and favorable speed-quality tradeoffs versus independent multi-trajectory baselines.

Significance. If the consistency result holds under the stated assumptions, the work provides a principled SMC-based mechanism for allocating computation more efficiently in guided diffusion by focusing on plausible trajectories, which could advance training-free conditional generation methods. The explicit conditioning of the consistency claim on idealized assumptions and the use of standard SMC principles applied to tempered diffusion posteriors are strengths, as is the inclusion of controlled experiments demonstrating practical gains over baselines.

major comments (1)

The central consistency claim (abstract and theoretical section) is explicitly conditioned on idealized exact-reconstruction assumptions; the manuscript should include a concrete discussion or counterexample showing how violation of this assumption affects finite-particle behavior, as this is load-bearing for the practical interpretation of the result.

minor comments (3)

The abstract mentions 'lacks full derivation details, error analysis, or code'; if the full manuscript provides these in the theoretical section, they should be explicitly cross-referenced in the abstract for clarity.
Notation for the tempered posterior and auxiliary diffusion variables should be introduced with a clear table or diagram early in the methods section to aid readers unfamiliar with SMC applications to diffusion.
In the experiments section, the 2D inverse problem setup would benefit from an explicit statement of the observation model and noise levels used, to allow direct reproduction of the reported improvements.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for their positive assessment and recommendation for minor revision. We address the single major comment below.

read point-by-point responses

Referee: The central consistency claim (abstract and theoretical section) is explicitly conditioned on idealized exact-reconstruction assumptions; the manuscript should include a concrete discussion or counterexample showing how violation of this assumption affects finite-particle behavior, as this is load-bearing for the practical interpretation of the result.

Authors: We agree that elaborating on the practical effects of violating the exact-reconstruction assumption strengthens the interpretation of the consistency result. In the revised manuscript we will add a dedicated paragraph (or short subsection) in the theoretical section. This will explicitly note that the consistency guarantee is asymptotic and holds only under the idealized assumption; when reconstruction is approximate (as occurs in all practical settings), finite-particle approximations can exhibit bias in the weights. We will illustrate this with a simple numerical counterexample on the controlled 2D inverse problem, replacing the exact reconstructor with a deliberately noisy approximation and showing the resulting degradation in posterior approximation quality for small particle counts, while still demonstrating that the tempered SMC procedure outperforms independent guided trajectories. revision: yes

Circularity Check

0 steps flagged

No significant circularity detected

full rationale

The derivation applies standard annealed SMC consistency results to a tempered posterior constructed from a diffusion prior and observation likelihood. The central consistency claim is explicitly conditioned on idealized exact-reconstruction assumptions and does not reduce any target quantity to a fitted parameter or self-referential definition internal to TGD. No load-bearing step equates a prediction to its own inputs by construction, nor does the argument rely on self-citations whose content is unverified outside the present work. The framework remains self-contained against external SMC theory.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claim rests on standard diffusion model properties and SMC theory without introducing new free parameters or invented entities; the consistency result explicitly invokes idealized assumptions.

axioms (2)

domain assumption Diffusion models define a valid prior over clean signals via the reverse noising process.
Invoked as the base distribution for conditional sampling throughout the abstract.
ad hoc to paper Exact reconstruction of the clean signal is possible under idealized conditions.
Explicitly required for the consistency guarantee stated in the abstract.

pith-pipeline@v0.9.0 · 5484 in / 1268 out tokens · 46801 ms · 2026-05-07T12:56:27.252578+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

63 extracted references · 19 canonical work pages · 6 internal anchors

[1]

Bilal Ahmed and Joseph G. Makin. Solving diffusion inverse problems with restart posterior sampling, 2025. URLhttps://arxiv.org/abs/2511.20705

work page arXiv 2025
[2]

Anderson

Brian D.O. Anderson. Reverse-time diffusion equation models.Stochastic Processes and their Applications, 12(3):313–326, 1982. ISSN 0304-4149. doi: https://doi.org/10.1016/ 0304-4149(82)90051-5. URL https://www.sciencedirect.com/science/article/ pii/0304414982900515

work page arXiv 1982
[3]

Universal guidance for diffusion models

Arpit Bansal, Hong-Min Chu, Avi Schwarzschild, Roni Sengupta, Micah Goldblum, Jonas Geiping, and Tom Goldstein. Universal guidance for diffusion models. InThe Twelfth Inter- national Conference on Learning Representations, 2024. URL https://openreview.net/ forum?id=pzpWBbnwiJ

2024
[4]

Sliced and Radon Wasserstein Barycenters of Measures , journal =

Nicolas Bonneel, Julien Rabin, Gabriel Peyré, and Hanspeter Pfister. Sliced and radon wasserstein barycenters of measures.Journal of Mathematical Imaging and Vision, 51: 22–45, 2015. doi: 10.1007/s10851-014-0506-3. URL https://doi.org/10.1007/ s10851-014-0506-3

work page doi:10.1007/s10851-014-0506-3 2015
[5]

Markovian flow matching: Acceler- ating mcmc with continuous normalizing flows.Advances in Neural Information Processing Systems, 37:104383–104411, 2024

Alberto Cabezas, Louis Sharrock, and Christopher Nemeth. Markovian flow matching: Acceler- ating mcmc with continuous normalizing flows.Advances in Neural Information Processing Systems, 37:104383–104411, 2024

2024
[6]

An introduction to sequential monte carlo

Nicolas Chopin. An introduction to sequential monte carlo. Technical report, University of Bristol, 2004. Technical report

2004
[7]

Improving diffusion models for inverse problems using manifold constraints

Hyungjin Chung, Byeongsu Sim, Dohoon Ryu, and Jong Chul Ye. Improving diffusion models for inverse problems using manifold constraints. In Alice H. Oh, Alekh Agarwal, Danielle Belgrave, and Kyunghyun Cho, editors,Advances in Neural Information Processing Systems,
[8]

URLhttps://openreview.net/forum?id=nJJjv0JDJju
[9]

Diffusion Posterior Sampling for General Noisy Inverse Problems

Hyungjin Chung, Jeongsol Kim, Michael T. Mccann, Marc L. Klasky, and Jong Chul Ye. Diffusion posterior sampling for general noisy inverse problems, 2024. URL https://arxiv. org/abs/2209.14687

work page internal anchor Pith review arXiv 2024
[10]

Springer New York, New York, NY , 2004

Pierre Del Moral.Feynman-Kac Formulae, pages 47–93. Springer New York, New York, NY , 2004. ISBN 978-1-4684-9393-1. doi: 10.1007/978-1-4684-9393-1_2. URL https: //doi.org/10.1007/978-1-4684-9393-1_2

work page doi:10.1007/978-1-4684-9393-1_2 2004
[11]

Sequential monte carlo samplers.Journal of the Royal Statistical Society: Series B (Statistical Methodology), 68(3):411–436, 2006

Pierre Del Moral, Arnaud Doucet, and Ajay Jasra. Sequential monte carlo samplers.Journal of the Royal Statistical Society: Series B (Statistical Methodology), 68(3):411–436, 2006

2006
[12]

Imagenet: A large-scale hierarchical image database

Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, and Li Fei-Fei. Imagenet: A large-scale hierarchical image database. InProceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 248–255, 2009

2009
[13]

Diffusion Models Beat GANs on Image Synthesis

Prafulla Dhariwal and Alex Nichol. Diffusion models beat gans on image synthesis, 2021. URL https://arxiv.org/abs/2105.05233. 10

work page internal anchor Pith review arXiv 2021
[14]

Diffusion posterior sampling for linear inverse problem solving: A filtering perspective

Zehao Dou and Yang Song. Diffusion posterior sampling for linear inverse problem solving: A filtering perspective. InThe Twelfth International Conference on Learning Representations,
[15]

URLhttps://openreview.net/forum?id=tplXNcHZs1
[16]

Statistics for Engineering and Information Science

Arnaud Doucet, Nando de Freitas, and Neil Gordon.Sequential Monte Carlo Methods in Practice. Statistics for Engineering and Information Science. Springer, New York, 2001

2001
[17]

Oates, and Chris Sherlock.Scalable Monte Carlo for Bayesian Learning

Paul Fearnhead, Christopher Nemeth, Chris J. Oates, and Chris Sherlock.Scalable Monte Carlo for Bayesian Learning. Institute of Mathematical Statistics Monographs. Cambridge University Press, 2025

2025
[18]

Alaya, Aurélie Boisbunon, Stanislas Chambon, Laetitia Chapel, Adrien Corenflos, Kilian Fatras, Nemo Fournier, Léo Gautheron, Nathalie T.H

Rémi Flamary, Nicolas Courty, Alexandre Gramfort, Mokhtar Z. Alaya, Aurélie Boisbunon, Stanislas Chambon, Laetitia Chapel, Adrien Corenflos, Kilian Fatras, Nemo Fournier, Léo Gautheron, Nathalie T.H. Gayraud, Hicham Janati, Alain Rakotomamonjy, Ievgen Redko, Antoine Rolet, Antony Schutz, Vivien Seguy, Danica J. Sutherland, Romain Tavenard, Alexander Tong,...

2021
[19]

Pot python optimal transport (version 0.9.5), 2024

Rémi Flamary, Cédric Vincent-Cuaz, Nicolas Courty, Alexandre Gramfort, Oleksii Kachaiev, Huy Quang Tran, Laurène David, Clément Bonet, Nathan Cassereau, Théo Gnassounou, Eloi Tanguy, Julie Delon, Antoine Collas, Sonia Mazelet, Laetitia Chapel, Tanguy Kerdoncuff, Xizheng Yu, Matthew Feickert, Paul Krzakala, Tianlin Liu, and Eduardo Fernandes Montesuma. Pot...

2024
[20]

REG: Rectified gradient guidance for conditional diffusion models

Zhengqi Gao, Kaiwen Zha, Tianyuan Zhang, Zihui Xue, and Duane S Boning. REG: Rectified gradient guidance for conditional diffusion models. InForty-second International Conference on Machine Learning, 2025. URLhttps://openreview.net/forum?id=uNK4ftGdnq

2025
[21]

Chapman and Hall/CRC, 1995

Andrew Gelman, John B Carlin, Hal S Stern, and Donald B Rubin.Bayesian data analysis. Chapman and Hall/CRC, 1995

1995
[22]

Manifold preserv- ing guided diffusion.arXiv preprint arXiv:2311.16424, 2023

Yutong He, Naoki Murata, Chieh-Hsin Lai, Yuhta Takida, Toshimitsu Uesaka, Dongjun Kim, Wei-Hsiang Liao, Yuki Mitsufuji, J. Zico Kolter, Ruslan Salakhutdinov, and Stefano Ermon. Manifold preserving guided diffusion, 2023. URLhttps://arxiv.org/abs/2311.16424

work page arXiv 2023
[23]

Classifier-Free Diffusion Guidance

Jonathan Ho and Tim Salimans. Classifier-free diffusion guidance, 2022. URL https:// arxiv.org/abs/2207.12598

work page internal anchor Pith review arXiv 2022
[24]

Denoising Diffusion Probabilistic Models

Jonathan Ho, Ajay Jain, and Pieter Abbeel. Denoising diffusion probabilistic models, 2020. URLhttps://arxiv.org/abs/2006.11239

work page internal anchor Pith review arXiv 2020
[25]

A style-based generator architecture for generative adversarial networks

Tero Karras, Samuli Laine, and Timo Aila. A style-based generator architecture for generative adversarial networks. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 4401–4410, 2019

2019
[26]

Elucidating the Design Space of Diffusion-Based Generative Models

Tero Karras, Miika Aittala, Timo Aila, and Samuli Laine. Elucidating the design space of diffusion-based generative models, 2022. URLhttps://arxiv.org/abs/2206.00364

work page internal anchor Pith review arXiv 2022
[27]

Guiding a diffusion model with a bad version of itself

Tero Karras, Miika Aittala, Tuomas Kynkäänniemi, Jaakko Lehtinen, Timo Aila, and Samuli Laine. Guiding a diffusion model with a bad version of itself. InThe Thirty-eighth Annual Conference on Neural Information Processing Systems, 2024. URL https://openreview. net/forum?id=bg6fVPVs3s

2024
[28]

SNIPS: Solving noisy inverse problems stochastically.Advances in Neural Information Processing Systems, 34:21757–21769, 2021

Bahjat Kawar, Gregory Vaksman, and Michael Elad. SNIPS: Solving noisy inverse problems stochastically.Advances in Neural Information Processing Systems, 34:21757–21769, 2021

2021
[29]

Denoising diffusion restoration models

Bahjat Kawar, Michael Elad, Stefano Ermon, and Jiaming Song. Denoising diffusion restoration models. In S. Koyejo, S. Mohamed, A. Agarwal, D. Belgrave, K. Cho, and A. Oh, editors, Advances in Neural Information Processing Systems, volume 35, pages 23593–23606. Curran Associates, Inc., 2022. URL https://proceedings.neurips.cc/paper_files/paper/ 2022/file/9...

work page arXiv 2022
[30]

Particle algorithms for maximum likelihood training of latent variable models

Juan Kuntz, Jen Ning Lim, and Adam M Johansen. Particle algorithms for maximum likelihood training of latent variable models. InInternational Conference on Artificial Intelligence and Statistics, pages 5134–5180. PMLR, 2023

2023
[31]

Repaint: Inpainting using denoising diffusion probabilistic models

Andreas Lugmayr, Martin Danelljan, Andres Romero, Fisher Yu, Radu Timofte, and Luc Van Gool. Repaint: Inpainting using denoising diffusion probabilistic models. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022

2022
[32]

A variational perspective on solving inverse problems with diffusion models.arXiv preprint arXiv:2305.04391, 2023

Morteza Mardani, Jiaming Song, Jan Kautz, and Arash Vahdat. A variational perspective on solving inverse problems with diffusion models.arXiv preprint arXiv:2305.04391, 2023

work page arXiv 2023
[33]

GibbsDDRM: A partially collapsed gibbs sampler for solving blind inverse problems with denoising diffusion restoration

Naoki Murata, Koichi Saito, Chieh-Hsin Lai, Yuhta Takida, Toshimitsu Uesaka, Yuki Mitsufuji, and Stefano Ermon. GibbsDDRM: A partially collapsed gibbs sampler for solving blind inverse problems with denoising diffusion restoration. InInternational Conference on Machine Learning, 2023

2023
[34]

MIT press, 2022

Kevin P Murphy.Probabilistic machine learning: an introduction. MIT press, 2022

2022
[35]

Naesseth, Fredrik Lindsten, and Thomas B

Christian A. Naesseth, Fredrik Lindsten, and Thomas B. Schön. Elements of sequential monte carlo.Found. Trends Mach. Learn., 12(3):307–392, November 2019. ISSN 1935-8237. doi: 10.1561/2200000074. URLhttps://doi.org/10.1561/2200000074

work page doi:10.1561/2200000074 2019
[36]

Sampling from multimodal distributions using tempered transitions.Statistics and computing, 6(4):353–366, 1996

Radford M Neal. Sampling from multimodal distributions using tempered transitions.Statistics and computing, 6(4):353–366, 1996

1996
[37]

Radford M. Neal. Annealed importance sampling, 1998. URL https://arxiv.org/abs/ physics/9803008

work page arXiv 1998
[38]

Pseudo- extended markov chain monte carlo.Advances in Neural Information Processing Systems, 32, 2019

Christopher Nemeth, Fredrik Lindsten, Maurizio Filippone, and James Hensman. Pseudo- extended markov chain monte carlo.Advances in Neural Information Processing Systems, 32, 2019

2019
[39]

Improved denoising diffusion probabilistic models

Alex Nichol and Prafulla Dhariwal. Improved denoising diffusion probabilistic models. InProceedings of the 38th International Conference on Machine Learning, volume 139 ofProceedings of Machine Learning Research, pages 8162–8171. PMLR, 2021. URL https://proceedings.mlr.press/v139/nichol21a.html

2021
[40]

Variational control for guidance in diffusion models

Kushagra Pandey, Farrin Marouf Sofian, Felix Draxler, Theofanis Karaletsos, and Stephan Mandt. Variational control for guidance in diffusion models. InForty-second Interna- tional Conference on Machine Learning, 2025. URL https://openreview.net/forum? id=Z0ffRRtOim

2025
[41]

Free hunch: Denoiser covariance estimation for diffusion models without extra costs

Severi Rissanen, Markus Heinonen, and Arno Solin. Free hunch: Denoiser covariance estimation for diffusion models without extra costs. InThe Thirteenth International Conference on Learning Representations, 2025. URLhttps://openreview.net/forum?id=4JK2XMGUc8

2025
[42]

Bayes and big data: The consensus monte carlo algorithm

Steven L Scott, Alexander W Blocker, Fernando V Bonassi, Hugh A Chipman, Edward I George, and Robert E McCulloch. Bayes and big data: The consensus monte carlo algorithm. InBig Data and Information Theory, pages 8–18. Routledge, 2022

2022
[43]

Tuning-free maximum likelihood training of latent variable models via coin betting

Louis Sharrock, Daniel Dodd, and Christopher Nemeth. Tuning-free maximum likelihood training of latent variable models via coin betting. InInternational Conference on Artificial Intelligence and Statistics, pages 1810–1818. PMLR, 2024

2024
[44]

Deep unsu- pervised learning using nonequilibrium thermodynamics

Jascha Sohl-Dickstein, Eric Weiss, Niru Maheswaranathan, and Surya Ganguli. Deep unsu- pervised learning using nonequilibrium thermodynamics. In Francis Bach and David Blei, editors,Proceedings of the 32nd International Conference on Machine Learning, volume 37 of Proceedings of Machine Learning Research, pages 2256–2265, Lille, France, 07–09 Jul 2015. PM...

2015
[45]

Denoising diffusion implicit models

Jiaming Song, Chenlin Meng, and Stefano Ermon. Denoising diffusion implicit models. In International Conference on Learning Representations, 2021. URL https://openreview. net/forum?id=St1giarCHLP. 12

2021
[46]

Pseudoinverse-guided diffusion models for inverse problems

Jiaming Song, Arash Vahdat, Morteza Mardani, and Jan Kautz. Pseudoinverse-guided diffusion models for inverse problems. InInternational Conference on Learning Representations, 2023. URLhttps://openreview.net/forum?id=9_gsMA8MRKQ

2023
[47]

Generative modeling by estimating gradients of the data distribution

Yang Song and Stefano Ermon. Generative modeling by estimating gradients of the data distribution. In H. Wallach, H. Larochelle, A. Beygelzimer, F. d'Alché-Buc, E. Fox, and R. Garnett, editors,Advances in Neural Information Processing Systems, volume 32. Curran Associates, Inc., 2019. URL https://proceedings.neurips.cc/paper_files/paper/ 2019/file/3001ef2...

2019
[48]

Improved techniques for training score-based generative models

Yang Song and Stefano Ermon. Improved techniques for training score-based generative models. In H. Larochelle, M. Ranzato, R. Hadsell, M.F. Balcan, and H. Lin, editors,Advances in Neural Information Processing Systems, volume 33, pages 12438–12448. Curran Associates, Inc., 2020. URL https://proceedings.neurips.cc/paper_files/paper/2020/file/ 92c3b916311a5...

2020
[49]

Score-Based Generative Modeling through Stochastic Differential Equations

Yang Song, Jascha Sohl-Dickstein, Diederik P. Kingma, Abhishek Kumar, Stefano Ermon, and Ben Poole. Score-based generative modeling through stochastic differential equations, 2021. URLhttps://arxiv.org/abs/2011.13456

work page internal anchor Pith review arXiv 2021
[50]

Solving inverse problems in medi- cal imaging with score-based generative models

Yang Song, Liyue Shen, Lei Xing, and Stefano Ermon. Solving inverse problems in medi- cal imaging with score-based generative models. InInternational Conference on Learning Representations, 2022. URLhttps://openreview.net/forum?id=vaRCHVj0uGI

2022
[51]

Trippe, Jason Yim, Doug Tischer, David Baker, Tamara Broderick, Regina Barzilay, and Tommi S

Brian L. Trippe, Jason Yim, Doug Tischer, David Baker, Tamara Broderick, Regina Barzilay, and Tommi S. Jaakkola. Diffusion probabilistic modeling of protein backbones in 3d for the motif- scaffolding problem. InThe Eleventh International Conference on Learning Representations,
[52]

URLhttps://openreview.net/forum?id=6TxBxqNME1Y
[53]

Swiss: A scalable markov chain monte carlo divide-and-conquer strategy.Stat, 12(1):e523, 2023

Callum Vyner, Christopher Nemeth, and Chris Sherlock. Swiss: A scalable markov chain monte carlo divide-and-conquer strategy.Stat, 12(1):e523, 2023

2023
[54]

Zero-shot image restoration using denoising diffusion null-space model

Yinhuai Wang, Jiwen Yu, and Jian Zhang. Zero-shot image restoration using denoising diffusion null-space model. InThe Eleventh International Conference on Learning Representations, 2023. URLhttps://openreview.net/forum?id=mRieQgMtNTQ

2023
[55]

Trippe, Christian A

Luhuan Wu, Brian L. Trippe, Christian A. Naesseth, David M. Blei, and John P. Cunningham. Practical and asymptotically exact conditional sampling in diffusion models, 2024. URL https://arxiv.org/abs/2306.17775

work page arXiv 2024
[56]

Tfg: Unified training-free guidance for dif- fusion models

Haotian Ye, Haowei Lin, Jiaqi Han, Minkai Xu, Sheng Liu, Yitao Liang, Jianzhu Ma, James Zou, and Stefano Ermon. Tfg: Unified training-free guidance for dif- fusion models. In A. Globerson, L. Mackey, D. Belgrave, A. Fan, U. Paquet, J. Tomczak, and C. Zhang, editors,Advances in Neural Information Processing Sys- tems, volume 37, pages 22370–22417. Curran A...

work page arXiv 2024
[57]

Freedom: Training- free energy-guided conditional diffusion model.Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2023

Jiwen Yu, Yinhuai Wang, Chen Zhao, Bernard Ghanem, and Jian Zhang. Freedom: Training- free energy-guided conditional diffusion model.Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2023

2023
[58]

Videodpo: Omni- preference alignment for video diffusion generation

Bingliang Zhang, Wenda Chu, Julius Berner, Chenlin Meng, Anima Anandkumar, and Yang Song. Improving diffusion inverse problem solving with decoupled noise anneal- ing. In2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), page 20895–20905. IEEE, June 2025. doi: 10.1109/cvpr52734.2025.01946. URL http: //dx.doi.org/10.1109/CVPR52734....

work page doi:10.1109/cvpr52734.2025.01946 2025
[59]

Efros, Eli Shechtman, and Oliver Wang

Richard Zhang, Phillip Isola, Alexei A. Efros, Eli Shechtman, and Oliver Wang. The unrea- sonable effectiveness of deep features as a perceptual metric. InProceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 586–595, 2018

2018
[60]

Denoising diffusion models for plug-and-play image restoration

Yuanzhi Zhu, Kai Zhang, Jingyun Liang, Jiezhang Cao, Bihan Wen, Radu Timofte, and Luc Van Gool. Denoising diffusion models for plug-and-play image restoration. InIEEE Conference on Computer Vision and Pattern Recognition Workshops (NTIRE), 2023. 13 A Lifted SMC Construction and Incremental Weights We derive the incremental weight update used in the stagew...

2023
[61]

, R , 0< Z r <∞ , and the tempering schedule satisfies 0≤λ R ≤λ R−1 ≤ · · · ≤λ 0 = 1

Well-defined targets.For every r= 0, . . . , R , 0< Z r <∞ , and the tempering schedule satisfies 0≤λ R ≤λ R−1 ≤ · · · ≤λ 0 = 1. 2.Initialization.λ R = 0, and the initial particles are sampled i.i.d. from the noisy prior: z(i) R ∼p sR(zR), w (i) R = 1/N. Equivalently, the initial weighted empirical measure is consistent forη R
[62]

Exact reconstruction.For each r= 0, . . . , R , the stagewise reconstruction kernel used by ideal TGD is the exact conditional law Kr(dx0 |z r) :=p λr,sr(x0 |z r,y)dx 0 = πr(x0)psr(zr |x 0) ηr(zr) dx0.(23) Equivalently, pλr,sr(x0 |z r,y)∝p sr(x0 |z r)ℓ(x0)λr .(24) 4.Bounded incremental potentials.For eachr=R, . . . ,1, the incremental potential Gr(x0) :=ℓ...
[63]

No annealing

Consistent resampling.At each outer stage, either resampling is omitted or the resampling scheme preserves consistency of weighted empirical measures: whenever the weighted particle system is consistent for a target distribution, the equally weighted resampled particles are also consistent for the same target. Standard SMC schemes [6, 33] such as multinom...

work page arXiv 2042