Dimension-Uniform Discretization Analysis of Preconditioned Annealed Langevin Dynamics for Multimodal Gaussian Mixtures

arxiv: 2605.16473 · v1 · submitted 2026-05-15 · 📊 stat.ML · cs.LG· cs.NA· math.NA· math.PR

Dimension-Uniform Discretization Analysis of Preconditioned Annealed Langevin Dynamics for Multimodal Gaussian Mixtures

Lorenzo Baldassari , Josselin Garnier , Knut Solna , Maarten V. de Hoop This is my paper

Pith reviewed 2026-05-19 21:55 UTC · model grok-4.3

classification 📊 stat.ML cs.LGcs.NAmath.NAmath.PR

keywords preconditioned annealed Langevin dynamicsGaussian mixturesdimension-uniform boundsexponential integratordiscretization analysisKL divergencemultimodal samplingspectral summability

0 comments p. Extension

The pith

Exponential-integrator discretization of preconditioned annealed Langevin dynamics yields dimension-uniform KL bounds for Gaussian mixtures under spectral summability conditions.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper studies discretization stability for preconditioned annealed Langevin dynamics when sampling multimodal Gaussian mixtures in high and infinite dimensions. Euler-Maruyama discretization of the stiff linear term imposes a stability constraint that forces the initial smoothed law to stay uniformly close to the target across dimensions. An exponential-integrator scheme integrates that stiff part exactly and, under explicit spectral summability conditions linking the smoothing covariance, component covariance spectra, and preconditioner, produces a KL bound between the discrete law and the target that remains uniform in dimension. The bound can be driven arbitrarily small, uniformly in dimension, by taking sufficient annealing time and then refining the time mesh. These conditions are milder than those required by Euler-Maruyama and explicitly allow the initial smoothed-to-target KL divergence to grow with dimension.

Core claim

Under explicit spectral summability conditions coupling the smoothing covariance, the component covariance spectra, and the preconditioner, the exponential-integrator scheme for preconditioned annealed Langevin dynamics produces a Kullback-Leibler bound that is uniform in dimension and can be made arbitrarily small uniformly in dimension by allowing enough annealing time followed by time-mesh refinement. The same conditions permit regimes in which the KL divergence between the target and the initial smoothed law diverges with dimension, showing that the stricter restrictions of Euler-Maruyama discretization are scheme-dependent rather than intrinsic to annealed Langevin dynamics.

What carries the argument

The exponential-integrator scheme that integrates the stiff linear part of the annealed score exactly, under spectral summability conditions on the smoothing covariance, component covariances, and preconditioner.

Load-bearing premise

The target is a finite Gaussian mixture whose component covariance spectra satisfy the required summability conditions together with the chosen smoothing covariance and preconditioner.

What would settle it

Run the exponential-integrator scheme in successively higher dimensions using a preconditioner and smoothing covariance that obey the summability conditions but where the initial smoothed-to-target KL grows; if the final discrete-law KL stays bounded while dimension increases, the dimension-uniform claim holds.

Figures

Figures reproduced from arXiv: 2605.16473 by Josselin Garnier, Knut Solna, Lorenzo Baldassari, Maarten V. de Hoop.

**Figure 1.** Figure 1: Left: empirical KL(ρ d ⋆∥ Law(Y d T )) versus dimension d on a log scale. EM grows rapidly, while ELP remains stable. Right: coordinate-wise variance profile at d = 50, normalized by the target marginal variance. EM is unstable in the high-frequency coordinates, while ELP remains near the target scale. The KL is estimated by kNN with k = 20; robustness to k is reported in Appendix D. 9 [PITH_FULL_IMAGE:fi… view at source ↗

**Figure 2.** Figure 2: kNN KL estimates for different choices of k. The qualitative behavior is stable across k ∈ {10, 20, 30, 50} (the main text reports k = 20): EM grows rapidly once it enters the highfrequency stiffness regime, whereas ELP remains stable. 38 [PITH_FULL_IMAGE:figures/full_fig_p038_2.png] view at source ↗

read the original abstract

Obtaining stable diffusion-based samplers in high- and infinite-dimensional settings is challenging because errors can accumulate across high-frequency coordinates and make the dynamics unstable under refinement of the finite-dimensional approximation of the underlying function-space problem. Discretization is a typical source of such errors, and preconditioning with a suitable spectral decay is one way to control their accumulation. In this paper, we study this problem for preconditioned annealed Langevin dynamics (ALD) applied to Gaussian mixtures. We first show that Euler-Maruyama (EM) discretization, by treating the stiff linear part of the annealed score with a forward Euler step, imposes a stability constraint coupling the preconditioner with the annealed covariance scale. Together with the conditions ensuring dimension-uniform control of the annealed dynamics, this constraint forces the initial smoothed law to remain uniformly close to the target across dimensions. We then consider an exponential-integrator scheme that integrates the stiff linear part of the annealed score exactly. Under explicit spectral summability conditions coupling the smoothing covariance, the component covariance spectra, and the preconditioner, we prove a dimension-uniform Kullback-Leibler (KL) bound for this scheme. This bound can be made arbitrarily small, uniformly in dimension, by allowing enough time for annealing and then refining the time mesh accordingly. Importantly, these conditions allow regimes in which the KL divergence between the target and the initial smoothed law diverges with dimension, showing that the restrictions imposed by EM are scheme-dependent rather than intrinsic to ALD.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

Exponential integrators relax the dimension-uniformity restrictions on annealed Langevin for Gaussian mixtures compared to Euler-Maruyama, but only under spectral conditions whose generality for non-commuting operators remains to be checked.

read the letter

The main takeaway is that discretization choice affects whether annealed Langevin can deliver dimension-uniform KL bounds without forcing the initial smoothed law to stay close to the target. The exponential integrator achieves this under explicit spectral summability conditions on the smoothing covariance, component spectra, and preconditioner, while Euler-Maruyama does not. This distinction is the clearest new element relative to earlier ALD discretization work.

Referee Report

2 major / 1 minor

Summary. The paper analyzes discretization of preconditioned annealed Langevin dynamics for finite multimodal Gaussian mixtures. It shows that Euler-Maruyama discretization of the annealed score imposes a stability constraint that, together with dimension-uniform control of the annealed dynamics, forces the initial smoothed law to remain uniformly close to the target across dimensions. In contrast, an exponential-integrator scheme that integrates the stiff linear part exactly yields a dimension-uniform KL bound under explicit spectral summability conditions coupling the smoothing covariance, the spectra of the component covariances, and the preconditioner. This bound can be driven to zero uniformly in dimension by sufficient annealing time followed by mesh refinement, and the conditions permit regimes in which the initial KL divergence diverges with dimension.

Significance. If the central claim holds, the work establishes that the feasibility of dimension-uniform error control for annealed Langevin sampling is scheme-dependent rather than intrinsic, and that spectral summability can decouple the initial-distribution closeness requirement from the discretization error. This is a concrete advance for understanding stable high-dimensional and function-space samplers. The explicit coupling of spectra and the allowance for diverging initial KL are strengths that could guide preconditioner design.

major comments (2)

[§3 and main KL theorem] §3 (exponential-integrator analysis) and the main KL theorem: the mode-by-mode bounding after exact linear integration is derived under the stated spectral summability conditions. However, a general finite Gaussian mixture has component covariances whose eigenbases need not coincide with each other or with the chosen smoothing covariance and preconditioner. The resulting cross terms in the score and in the evolution of the KL divergence are not obviously dominated by the per-eigenvalue summability; the manuscript should either assume simultaneous diagonalizability or supply an explicit bound showing that the cross terms remain controlled under the given conditions.
[Setup and conditions paragraph] Setup and conditions paragraph (abstract and §2): the claim that the summability conditions are non-vacuous for regimes in which KL(target, initial smoothed law) diverges with dimension is stated but not verified by an explicit construction or example for a finite mixture. Without such verification, it is unclear whether the dimension-uniform bound is achievable in a practically relevant regime or remains formal.

minor comments (1)

[Notation] Notation for the spectra of the component covariances versus the smoothing covariance should be introduced once and used consistently; currently the abstract uses overlapping symbols that could be clarified in the first section.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful reading and constructive comments on our manuscript. The points raised help clarify the scope of our assumptions and strengthen the presentation of our results. We respond to each major comment below and will revise the manuscript to address them.

read point-by-point responses

Referee: [§3 and main KL theorem] §3 (exponential-integrator analysis) and the main KL theorem: the mode-by-mode bounding after exact linear integration is derived under the stated spectral summability conditions. However, a general finite Gaussian mixture has component covariances whose eigenbases need not coincide with each other or with the chosen smoothing covariance and preconditioner. The resulting cross terms in the score and in the evolution of the KL divergence are not obviously dominated by the per-eigenvalue summability; the manuscript should either assume simultaneous diagonalizability or supply an explicit bound showing that the cross terms remain controlled under the given conditions.

Authors: We thank the referee for highlighting this important technical point. Our analysis is performed in the eigenbasis of the preconditioner and smoothing covariance, with the spectral summability conditions stated with respect to the eigenvalues in that basis. In the general non-aligned case, cross terms do appear in the score and KL evolution. Bounding these terms without further assumptions would require controlling the misalignment of eigenbases, which is possible in principle via operator-norm estimates but would complicate the explicit conditions. To preserve the clarity and verifiability of the spectral conditions while focusing on the scheme-dependent nature of the stability restrictions, we will revise the manuscript to explicitly assume simultaneous diagonalizability of the component covariances, smoothing covariance, and preconditioner. This is a standard assumption in spectral analyses of sampling algorithms and does not affect the central claim. We will update the setup in §2, the analysis in §3, and the statement of the main KL theorem, and add a brief remark discussing the assumption. revision: yes
Referee: [Setup and conditions paragraph] Setup and conditions paragraph (abstract and §2): the claim that the summability conditions are non-vacuous for regimes in which KL(target, initial smoothed law) diverges with dimension is stated but not verified by an explicit construction or example for a finite mixture. Without such verification, it is unclear whether the dimension-uniform bound is achievable in a practically relevant regime or remains formal.

Authors: We agree that an explicit example would make the claim more concrete and demonstrate that the conditions are achievable in relevant regimes. In the revised manuscript we will add a concrete construction in §2. Consider a two-component isotropic Gaussian mixture in d dimensions with component covariances that are diagonal in the same basis, with eigenvalues decaying as λ_k = k^{-2} for the target components. Choose the smoothing covariance with eigenvalues μ_k = k^{-1} and a preconditioner with eigenvalues σ_k = k^{-1.5} such that the spectral summability condition ∑_k |λ_k - μ_k| / σ_k remains finite independently of d, while the KL divergence between the target and the initial smoothed law diverges logarithmically with d due to the accumulation of small-eigenvalue discrepancies. Direct computation shows that the summability holds uniformly in d and that the discretization error bound can still be driven to zero by sufficient annealing followed by mesh refinement. This example will be included with the necessary calculations to verify both the summability and the diverging initial KL. revision: yes

Circularity Check

0 steps flagged

No circularity: direct analysis under explicit assumptions

full rationale

The paper derives a dimension-uniform KL bound for the exponential-integrator discretization of preconditioned annealed Langevin dynamics by bounding high-frequency mode contributions after exact integration of the linear part, under stated spectral summability conditions on the smoothing covariance, component covariance spectra, and preconditioner. These conditions are explicit assumptions (not derived from the target result), and the bound is obtained via direct SDE analysis without reduction to fitted quantities, self-referential definitions, or load-bearing self-citations. The derivation remains self-contained against the stated assumptions and does not collapse to its inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The analysis rests on standard properties of Gaussian measures and linear SDEs together with the domain assumption that the target is a finite mixture whose covariances obey the summability relation with the preconditioner; no free parameters or new entities are introduced.

axioms (1)

domain assumption The target distribution is a finite Gaussian mixture whose component covariance spectra satisfy the summability conditions with the smoothing covariance and preconditioner.
Invoked to guarantee the KL bound remains finite and controllable uniformly in dimension.

pith-pipeline@v0.9.0 · 5824 in / 1372 out tokens · 50516 ms · 2026-05-19T21:55:23.428361+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

54 extracted references · 54 canonical work pages

[1]

Lorenzo Baldassari, Josselin Garnier, Knut Sølna, and Maarten V . de Hoop. Preconditioned Langevin dynamics with score-based generative models for infinite-dimensional linear Bayesian inverse problems. InProceedings of the 39th International Conference on Neural Information Processing Systems, 2025

work page 2025
[2]

Dimension- free multimodal sampling via preconditioned annealed langevin dynamics.arXiv preprint arXiv:2602.01449, 2026

Lorenzo Baldassari, Josselin Garnier, Knut Solna, and Maarten V de Hoop. Dimension- free multimodal sampling via preconditioned annealed langevin dynamics.arXiv preprint arXiv:2602.01449, 2026

work page arXiv 2026
[3]

Lorenzo Baldassari, Ali Siahkoohi, Josselin Garnier, Knut Sølna, and Maarten V . de Hoop. Conditional score-based diffusion models for Bayesian inference in infinite dimensions. In Proceedings of the 37th International Conference on Neural Information Processing Systems, 2023

work page 2023
[4]

Taming score-based diffusion priors for infinite-dimensional nonlinear inverse problems.arXiv preprint arXiv:2405.15676, 2024

Lorenzo Baldassari, Ali Siahkoohi, Josselin Garnier, Knut Solna, and Maarten V de Hoop. Taming score-based diffusion priors for infinite-dimensional nonlinear inverse problems.arXiv preprint arXiv:2405.15676, 2024

work page arXiv 2024
[5]

Geometric MCMC for infinite-dimensional inverse problems.Journal of Computational Physics, 335:327–351, 2017

Alexandros Beskos, Mark Girolami, Shiwei Lan, Patrick E Farrell, and Andrew M Stuart. Geometric MCMC for infinite-dimensional inverse problems.Journal of Computational Physics, 335:327–351, 2017

work page 2017
[6]

Coordinate-dependent diffusion in protein folding

Robert B Best and Gerhard Hummer. Coordinate-dependent diffusion in protein folding. Proceedings of the National Academy of Sciences, 107(3):1088–1093, 2010

work page 2010
[7]

Number 62

Vladimir Igorevich Bogachev.Gaussian Measures. Number 62. American Mathematical Soc., 1998

work page 1998
[8]

∞-diff: Infinite resolution diffusion with subsampled mollified states.International Conference on Learning Representations, 2024

Sam Bond-Taylor and Chris G Willcocks. ∞-diff: Infinite resolution diffusion with subsampled mollified states.International Conference on Learning Representations, 2024. 10

work page 2024
[9]

Efficient Langevin sampling with position-dependent diffusion.arXiv preprint arXiv:2501.02943, 2025

Eugen Bronasco, Benedict Leimkuhler, Dominic Phillips, and Gilles Vilmart. Efficient Langevin sampling with position-dependent diffusion.arXiv preprint arXiv:2501.02943, 2025

work page arXiv 2025
[10]

Diffusion annealed Langevin dynamics: a theoretical study.arXiv preprint arXiv:2511.10406, 2025

Patrick Cattiaux, Paula Cordero-Encinar, and Arnaud Guillin. Diffusion annealed Langevin dynamics: a theoretical study.arXiv preprint arXiv:2511.10406, 2025

work page arXiv 2025
[11]

Provable convergence and limitations of geometric tempering for Langevin dynamics.International Conference on Learning Representations, 2025

Omar Chehab, Anna Korba, Austin Stromme, and Adrien Vacher. Provable convergence and limitations of geometric tempering for Langevin dynamics.International Conference on Learning Representations, 2025

work page 2025
[12]

Sampling is as easy as learning the score: theory for diffusion models with minimal data assumptions.arXiv preprint arXiv:2209.11215, 2022

Sitan Chen, Sinho Chewi, Jerry Li, Yuanzhi Li, Adil Salim, and Anru R Zhang. Sampling is as easy as learning the score: theory for diffusion models with minimal data assumptions.arXiv preprint arXiv:2209.11215, 2022

work page arXiv 2022
[13]

Non-asymptotic analy- sis of diffusion annealed Langevin Monte Carlo for generative modelling.arXiv preprint arXiv:2502.09306, 2025

Paula Cordero-Encinar, O Deniz Akyildiz, and Andrew B Duncan. Non-asymptotic analy- sis of diffusion annealed Langevin Monte Carlo for generative modelling.arXiv preprint arXiv:2502.09306, 2025

work page arXiv 2025
[14]

MCMC methods for functions: modifying old algorithms to make them faster.Statistical Science, pages 424–446, 2013

Simon L Cotter, Gareth O Roberts, Andrew M Stuart, and David White. MCMC methods for functions: modifying old algorithms to make them faster.Statistical Science, pages 424–446, 2013

work page 2013
[15]

Dimension-independent likelihood- informed MCMC.Journal of Computational Physics, 304:109–137, 2016

Tiangang Cui, Kody JH Law, and Youssef M Marzouk. Dimension-independent likelihood- informed MCMC.Journal of Computational Physics, 304:109–137, 2016

work page 2016
[16]

Optimal Riemannian metric for Poincaré inequal- ities and how to ideally precondition Langevin dynamics.arXiv preprint arXiv:2404.02554, 2024

Tiangang Cui, Xin Tong, and Olivier Zahm. Optimal Riemannian metric for Poincaré inequal- ities and how to ideally precondition Langevin dynamics.arXiv preprint arXiv:2404.02554, 2024

work page arXiv 2024
[17]

Sparse regression learning by aggregation and langevin monte-carlo.Journal of Computer and System Sciences, 78(5):1423–1443, 2012

Arnak S Dalalyan and Alexandre B Tsybakov. Sparse regression learning by aggregation and langevin monte-carlo.Journal of Computer and System Sciences, 78(5):1423–1443, 2012

work page 2012
[18]

Spectral gap of replica exchange Langevin diffusion on mixture distributions.Stochastic Processes and their Applications, 151:451–489, 2022

Jing Dong and Xin T Tong. Spectral gap of replica exchange Langevin diffusion on mixture distributions.Stochastic Processes and their Applications, 151:451–489, 2022

work page 2022
[19]

Continuous-time functional diffusion processes.Advances in Neural Informa- tion Processing Systems, 36:37370–37400, 2023

Giulio Franzese, Giulio Corallo, Simone Rossi, Markus Heinonen, Maurizio Filippone, and Pietro Michiardi. Continuous-time functional diffusion processes.Advances in Neural Informa- tion Processing Systems, 36:37370–37400, 2023

work page 2023
[20]

Generative diffusion models in infinite dimensions: a survey.Philosophical Transactions A, 383(2299):20240322, 2025

Giulio Franzese and Pietro Michiardi. Generative diffusion models in infinite dimensions: a survey.Philosophical Transactions A, 383(2299):20240322, 2025

work page 2025
[21]

On sampling methods and annealing algorithms

Saul B Gelfand and Sanjoy K Mitter. On sampling methods and annealing algorithms. Technical report, 1990

work page 1990
[22]

MIT press Cambridge, 2016

Ian Goodfellow, Yoshua Bengio, Aaron Courville, and Yoshua Bengio.Deep learning, volume 1. MIT press Cambridge, 2016

work page 2016
[23]

Provable benefit of annealed Langevin Monte Carlo for non-log-concave sampling.International Conference on Learning Representations, 2025

Wei Guo, Molei Tao, and Yongxin Chen. Provable benefit of annealed Langevin Monte Carlo for non-log-concave sampling.International Conference on Learning Representations, 2025

work page 2025
[24]

Multilevel diffusion: Infinite dimensional score-based diffusion models for image generation

Paul Hagemann, Sophie Mildenberger, Lars Ruthotto, Gabriele Steidl, and Nicole Tianjiao Yang. Multilevel diffusion: Infinite dimensional score-based diffusion models for image generation. SIAM Journal on Mathematics of Data Science, 7(3):1337–1366, 2025

work page 2025
[25]

Spectral gaps for a Metropolis– Hastings algorithm in infinite dimensions.The Annals of Applied Probability, 24:2455–2490, 2014

Martin Hairer, Andrew M Stuart, and Sebastian J V ollmer. Spectral gaps for a Metropolis– Hastings algorithm in infinite dimensions.The Annals of Applied Probability, 24:2455–2490, 2014

work page 2014
[26]

Gerhard Hummer. Position-dependent diffusion coefficients and free energies from Bayesian analysis of equilibrium and replica molecular dynamics simulations.New Journal of Physics, 7(1):34–34, 2005

work page 2005
[27]

Diffusion generative models in infinite dimensions.International Conference on Artificial Intelligence and Statistics, 2023

Gavin Kerrigan, Justin Ley, and Padhraic Smyth. Diffusion generative models in infinite dimensions.International Conference on Artificial Intelligence and Statistics, 2023. 11

work page 2023
[28]

Optimization by simulated annealing

Scott Kirkpatrick, C Daniel Gelatt Jr, and Mario P Vecchi. Optimization by simulated annealing. Science, 220(4598):671–680, 1983

work page 1983
[29]

Kloeden and Eckhard Platen.Numerical Solution of Stochastic Differential Equations

Peter E. Kloeden and Eckhard Platen.Numerical Solution of Stochastic Differential Equations. Springer, Berlin, 1992

work page 1992
[30]

A stochastic exponential euler scheme for simulation of stiff biochemical reaction systems.BIT Numerical Mathematics, 54(4):1067–1085, 2014

Yoshio Komori and Kevin Burrage. A stochastic exponential euler scheme for simulation of stiff biochemical reaction systems.BIT Numerical Mathematics, 54(4):1067–1085, 2014

work page 2014
[31]

Optimizing the diffusion coefficient of overdamped Langevin dynamics.arXiv preprint arXiv:2404.12087, 2024

Tony Lelièvre, Grigorios A Pavliotis, Geneviève Robin, Régis Santet, and Gabriel Stoltz. Optimizing the diffusion coefficient of overdamped Langevin dynamics.arXiv preprint arXiv:2404.12087, 2024

work page arXiv 2024
[32]

Improving sampling by modifying the effective diffusion.Journal of Computational Physics, 541:114313, 2025

Tony Lelièvre, Régis Santet, and Gabriel Stoltz. Improving sampling by modifying the effective diffusion.Journal of Computational Physics, 541:114313, 2025

work page 2025
[33]

Score-based diffusion models in function space.Journal of Machine Learning Research, 26(158):1–62, 2025

Jae Hyun Lim, Nikola B Kovachki, Ricardo Baptista, Christopher Beckham, Kamyar Azizzade- nesheli, Jean Kossaifi, Vikram V oleti, Jiaming Song, Karsten Kreis, Jan Kautz, et al. Score-based diffusion models in function space.Journal of Machine Learning Research, 26(158):1–62, 2025

work page 2025
[34]

Sampling can be faster than optimization.Proceedings of the National Academy of Sciences, 116(42):20881– 20885, 2019

Yi-An Ma, Yuansi Chen, Chi Jin, Nicolas Flammarion, and Michael I Jordan. Sampling can be faster than optimization.Proceedings of the National Academy of Sciences, 116(42):20881– 20885, 2019

work page 2019
[35]

Simulated tempering: a new Monte Carlo scheme.EPL (Europhysics Letters), 19(6):451–458, 1992

Enzo Marinari and Giorgio Parisi. Simulated tempering: a new Monte Carlo scheme.EPL (Europhysics Letters), 19(6):451–458, 1992

work page 1992
[36]

John Wiley & Sons, 2000

Geoffrey J McLachlan and David Peel.Finite mixture models. John Wiley & Sons, 2000

work page 2000
[37]

Sampling from multimodal distributions using tempered transitions.Statistics and computing, 6(4):353–366, 1996

Radford M Neal. Sampling from multimodal distributions using tempered transitions.Statistics and computing, 6(4):353–366, 1996

work page 1996
[38]

Annealed importance sampling.Statistics and Computing, 11(2):125–139, 2001

Radford M Neal. Annealed importance sampling.Statistics and Computing, 11(2):125–139, 2001

work page 2001
[39]

Kullback-Leibler divergence estimation of continuous distributions

Fernando Pérez-Cruz. Kullback-Leibler divergence estimation of continuous distributions. In 2008 IEEE international symposium on information theory, pages 1666–1670. IEEE, 2008

work page 2008
[40]

Infinite-dimensional diffusion models.Journal of Machine Learning Research, 25(414):1–52, 2024

Jakiw Pidstrigach, Youssef Marzouk, Sebastian Reich, and Sven Wang. Infinite-dimensional diffusion models.Journal of Machine Learning Research, 25(414):1–52, 2024

work page 2024
[41]

Improving the convergence of reversible samplers.Journal of Statistical Physics, 164(3):472–494, 2016

Luc Rey-Bellet and Konstantinos Spiliopoulos. Improving the convergence of reversible samplers.Journal of Statistical Physics, 164(3):472–494, 2016

work page 2016
[42]

Optimal scaling for various Metropolis-Hastings algorithms.Statistical science, 16(4):351–367, 2001

Gareth O Roberts and Jeffrey S Rosenthal. Optimal scaling for various Metropolis-Hastings algorithms.Statistical science, 16(4):351–367, 2001

work page 2001
[43]

Langevin diffusions and Metropolis-Hastings algorithms

Gareth O Roberts and Osnat Stramer. Langevin diffusions and Metropolis-Hastings algorithms. Methodology and computing in applied probability, 4(4):337–357, 2002

work page 2002
[44]

Poincaré and log–sobolev inequalities for mixtures.Entropy, 21(1):89, 2019

André Schlichting. Poincaré and log–sobolev inequalities for mixtures.Entropy, 21(1):89, 2019

work page 2019
[45]

The convergence and ms stability of exponential Euler method for semilinear stochastic differential equations.Abstract and Applied Analysis, 2012:350407, 2012

Chunmei Shi, Yu Xiao, and Chiping Zhang. The convergence and ms stability of exponential Euler method for semilinear stochastic differential equations.Abstract and Applied Analysis, 2012:350407, 2012

work page 2012
[46]

Generative modeling by estimating gradients of the data distribution.Advances in Neural Information Processing Systems, 32, 2019

Yang Song and Stefano Ermon. Generative modeling by estimating gradients of the data distribution.Advances in Neural Information Processing Systems, 32, 2019

work page 2019
[47]

Improved techniques for training score-based generative models

Yang Song and Stefano Ermon. Improved techniques for training score-based generative models. Advances in Neural Information Processing Systems, 33:12438–12448, 2020. 12

work page 2020
[48]

Inverse problems: a Bayesian perspective.Acta numerica, 19:451–559, 2010

Andrew M Stuart. Inverse problems: a Bayesian perspective.Acta numerica, 19:451–559, 2010

work page 2010
[49]

Divergence estimation for multidimen- sional densities via k-nearest-neighbor distances.IEEE Transactions on Information Theory, 55(5):2392–2405, 2009

Qing Wang, Sanjeev R Kulkarni, and Sergio Verdú. Divergence estimation for multidimen- sional densities via k-nearest-neighbor distances.IEEE Transactions on Information Theory, 55(5):2392–2405, 2009

work page 2009
[50]

Fast sampling of diffusion models with exponential integrator.International Conference on Learning Representations, 2023

Qinsheng Zhang and Yongxin Chen. Fast sampling of diffusion models with exponential integrator.International Conference on Learning Representations, 2023

work page 2023
[51]

Annealed Langevin dynamics for massive MIMO detection.IEEE Transactions on Wireless Communications, 22(6):3762–3776, 2022

Nicolas Zilberstein, Chris Dick, Rahman Doost-Mohammady, Ashutosh Sabharwal, and Santi- ago Segarra. Annealed Langevin dynamics for massive MIMO detection.IEEE Transactions on Wireless Communications, 22(6):3762–3776, 2022

work page 2022
[52]

Solving linear inverse problems using higher-order annealed Langevin diffusion.IEEE Transactions on Signal Processing, 72:492–505, 2024

Nicolas Zilberstein, Ashutosh Sabharwal, and Santiago Segarra. Solving linear inverse problems using higher-order annealed Langevin diffusion.IEEE Transactions on Signal Processing, 72:492–505, 2024. 13 A Proofs of Section 3 A.1 Proof of Proposition 3.2 Since the two mixture components differ only in the first coordinate, the annealed path is πd t = 1 2 N...

work page 2024
[53]

1 vi,t,j − (xj −m ij)2 v2 i,t,j # . Therefore ζ d i,t(x)−ζ d ℓ,t(x) = 1 2T dX j=1 λj

= KL(ν⋆,1 ∥ν 0,1) + dX j=2 F(r j). SinceP j≥2 r2 j <∞, we haver j →0, so for all sufficiently largejalsor j ≤1, and then F(r j)≤ r2 j 4 . Hence X j≥2 F(r j)<∞, and the first-coordinate contribution is a fixed constant independent ofd. Therefore sup d≥1 KL(πd ⋆ ∥π d 0)<∞. This proves the claim. 15 A.2 Proof of Proposition 3.3 By joint convexity of relative...

work page
[54]

In the example, σ1j =σ j =j −6, σ 2j =σ j +δ j, δ 1 = 0, δ j =j −12 (j≥2)

+ 1 2 N(m d 2,Σ d 2), ρ d 0 =ρ d ⋆ ∗ N(0, C d). In the example, σ1j =σ j =j −6, σ 2j =σ j +δ j, δ 1 = 0, δ j =j −12 (j≥2). Hence σj =σ j =j −6, σj =σ j +δ j, σj −σ j =δ j. Moreover, m1 =a, mj = 0 (j≥2). Define ∆mj := sup i,ℓ∈I |mij −m ℓj|. Hence ∆m1 = 2a,∆m j = 0 (j≥2). We first verify the summability assumptions (9)–(15). Since λj =j −6, γ j =j −4, δ j =...

work page

[1] [1]

Lorenzo Baldassari, Josselin Garnier, Knut Sølna, and Maarten V . de Hoop. Preconditioned Langevin dynamics with score-based generative models for infinite-dimensional linear Bayesian inverse problems. InProceedings of the 39th International Conference on Neural Information Processing Systems, 2025

work page 2025

[2] [2]

Dimension- free multimodal sampling via preconditioned annealed langevin dynamics.arXiv preprint arXiv:2602.01449, 2026

Lorenzo Baldassari, Josselin Garnier, Knut Solna, and Maarten V de Hoop. Dimension- free multimodal sampling via preconditioned annealed langevin dynamics.arXiv preprint arXiv:2602.01449, 2026

work page arXiv 2026

[3] [3]

Lorenzo Baldassari, Ali Siahkoohi, Josselin Garnier, Knut Sølna, and Maarten V . de Hoop. Conditional score-based diffusion models for Bayesian inference in infinite dimensions. In Proceedings of the 37th International Conference on Neural Information Processing Systems, 2023

work page 2023

[4] [4]

Taming score-based diffusion priors for infinite-dimensional nonlinear inverse problems.arXiv preprint arXiv:2405.15676, 2024

Lorenzo Baldassari, Ali Siahkoohi, Josselin Garnier, Knut Solna, and Maarten V de Hoop. Taming score-based diffusion priors for infinite-dimensional nonlinear inverse problems.arXiv preprint arXiv:2405.15676, 2024

work page arXiv 2024

[5] [5]

Geometric MCMC for infinite-dimensional inverse problems.Journal of Computational Physics, 335:327–351, 2017

Alexandros Beskos, Mark Girolami, Shiwei Lan, Patrick E Farrell, and Andrew M Stuart. Geometric MCMC for infinite-dimensional inverse problems.Journal of Computational Physics, 335:327–351, 2017

work page 2017

[6] [6]

Coordinate-dependent diffusion in protein folding

Robert B Best and Gerhard Hummer. Coordinate-dependent diffusion in protein folding. Proceedings of the National Academy of Sciences, 107(3):1088–1093, 2010

work page 2010

[7] [7]

Number 62

Vladimir Igorevich Bogachev.Gaussian Measures. Number 62. American Mathematical Soc., 1998

work page 1998

[8] [8]

∞-diff: Infinite resolution diffusion with subsampled mollified states.International Conference on Learning Representations, 2024

Sam Bond-Taylor and Chris G Willcocks. ∞-diff: Infinite resolution diffusion with subsampled mollified states.International Conference on Learning Representations, 2024. 10

work page 2024

[9] [9]

Efficient Langevin sampling with position-dependent diffusion.arXiv preprint arXiv:2501.02943, 2025

Eugen Bronasco, Benedict Leimkuhler, Dominic Phillips, and Gilles Vilmart. Efficient Langevin sampling with position-dependent diffusion.arXiv preprint arXiv:2501.02943, 2025

work page arXiv 2025

[10] [10]

Diffusion annealed Langevin dynamics: a theoretical study.arXiv preprint arXiv:2511.10406, 2025

Patrick Cattiaux, Paula Cordero-Encinar, and Arnaud Guillin. Diffusion annealed Langevin dynamics: a theoretical study.arXiv preprint arXiv:2511.10406, 2025

work page arXiv 2025

[11] [11]

Provable convergence and limitations of geometric tempering for Langevin dynamics.International Conference on Learning Representations, 2025

Omar Chehab, Anna Korba, Austin Stromme, and Adrien Vacher. Provable convergence and limitations of geometric tempering for Langevin dynamics.International Conference on Learning Representations, 2025

work page 2025

[12] [12]

Sampling is as easy as learning the score: theory for diffusion models with minimal data assumptions.arXiv preprint arXiv:2209.11215, 2022

Sitan Chen, Sinho Chewi, Jerry Li, Yuanzhi Li, Adil Salim, and Anru R Zhang. Sampling is as easy as learning the score: theory for diffusion models with minimal data assumptions.arXiv preprint arXiv:2209.11215, 2022

work page arXiv 2022

[13] [13]

Non-asymptotic analy- sis of diffusion annealed Langevin Monte Carlo for generative modelling.arXiv preprint arXiv:2502.09306, 2025

Paula Cordero-Encinar, O Deniz Akyildiz, and Andrew B Duncan. Non-asymptotic analy- sis of diffusion annealed Langevin Monte Carlo for generative modelling.arXiv preprint arXiv:2502.09306, 2025

work page arXiv 2025

[14] [14]

MCMC methods for functions: modifying old algorithms to make them faster.Statistical Science, pages 424–446, 2013

Simon L Cotter, Gareth O Roberts, Andrew M Stuart, and David White. MCMC methods for functions: modifying old algorithms to make them faster.Statistical Science, pages 424–446, 2013

work page 2013

[15] [15]

Dimension-independent likelihood- informed MCMC.Journal of Computational Physics, 304:109–137, 2016

Tiangang Cui, Kody JH Law, and Youssef M Marzouk. Dimension-independent likelihood- informed MCMC.Journal of Computational Physics, 304:109–137, 2016

work page 2016

[16] [16]

Optimal Riemannian metric for Poincaré inequal- ities and how to ideally precondition Langevin dynamics.arXiv preprint arXiv:2404.02554, 2024

Tiangang Cui, Xin Tong, and Olivier Zahm. Optimal Riemannian metric for Poincaré inequal- ities and how to ideally precondition Langevin dynamics.arXiv preprint arXiv:2404.02554, 2024

work page arXiv 2024

[17] [17]

Sparse regression learning by aggregation and langevin monte-carlo.Journal of Computer and System Sciences, 78(5):1423–1443, 2012

Arnak S Dalalyan and Alexandre B Tsybakov. Sparse regression learning by aggregation and langevin monte-carlo.Journal of Computer and System Sciences, 78(5):1423–1443, 2012

work page 2012

[18] [18]

Spectral gap of replica exchange Langevin diffusion on mixture distributions.Stochastic Processes and their Applications, 151:451–489, 2022

Jing Dong and Xin T Tong. Spectral gap of replica exchange Langevin diffusion on mixture distributions.Stochastic Processes and their Applications, 151:451–489, 2022

work page 2022

[19] [19]

Continuous-time functional diffusion processes.Advances in Neural Informa- tion Processing Systems, 36:37370–37400, 2023

Giulio Franzese, Giulio Corallo, Simone Rossi, Markus Heinonen, Maurizio Filippone, and Pietro Michiardi. Continuous-time functional diffusion processes.Advances in Neural Informa- tion Processing Systems, 36:37370–37400, 2023

work page 2023

[20] [20]

Generative diffusion models in infinite dimensions: a survey.Philosophical Transactions A, 383(2299):20240322, 2025

Giulio Franzese and Pietro Michiardi. Generative diffusion models in infinite dimensions: a survey.Philosophical Transactions A, 383(2299):20240322, 2025

work page 2025

[21] [21]

On sampling methods and annealing algorithms

Saul B Gelfand and Sanjoy K Mitter. On sampling methods and annealing algorithms. Technical report, 1990

work page 1990

[22] [22]

MIT press Cambridge, 2016

Ian Goodfellow, Yoshua Bengio, Aaron Courville, and Yoshua Bengio.Deep learning, volume 1. MIT press Cambridge, 2016

work page 2016

[23] [23]

Provable benefit of annealed Langevin Monte Carlo for non-log-concave sampling.International Conference on Learning Representations, 2025

Wei Guo, Molei Tao, and Yongxin Chen. Provable benefit of annealed Langevin Monte Carlo for non-log-concave sampling.International Conference on Learning Representations, 2025

work page 2025

[24] [24]

Multilevel diffusion: Infinite dimensional score-based diffusion models for image generation

Paul Hagemann, Sophie Mildenberger, Lars Ruthotto, Gabriele Steidl, and Nicole Tianjiao Yang. Multilevel diffusion: Infinite dimensional score-based diffusion models for image generation. SIAM Journal on Mathematics of Data Science, 7(3):1337–1366, 2025

work page 2025

[25] [25]

Spectral gaps for a Metropolis– Hastings algorithm in infinite dimensions.The Annals of Applied Probability, 24:2455–2490, 2014

Martin Hairer, Andrew M Stuart, and Sebastian J V ollmer. Spectral gaps for a Metropolis– Hastings algorithm in infinite dimensions.The Annals of Applied Probability, 24:2455–2490, 2014

work page 2014

[26] [26]

Gerhard Hummer. Position-dependent diffusion coefficients and free energies from Bayesian analysis of equilibrium and replica molecular dynamics simulations.New Journal of Physics, 7(1):34–34, 2005

work page 2005

[27] [27]

Diffusion generative models in infinite dimensions.International Conference on Artificial Intelligence and Statistics, 2023

Gavin Kerrigan, Justin Ley, and Padhraic Smyth. Diffusion generative models in infinite dimensions.International Conference on Artificial Intelligence and Statistics, 2023. 11

work page 2023

[28] [28]

Optimization by simulated annealing

Scott Kirkpatrick, C Daniel Gelatt Jr, and Mario P Vecchi. Optimization by simulated annealing. Science, 220(4598):671–680, 1983

work page 1983

[29] [29]

Kloeden and Eckhard Platen.Numerical Solution of Stochastic Differential Equations

Peter E. Kloeden and Eckhard Platen.Numerical Solution of Stochastic Differential Equations. Springer, Berlin, 1992

work page 1992

[30] [30]

A stochastic exponential euler scheme for simulation of stiff biochemical reaction systems.BIT Numerical Mathematics, 54(4):1067–1085, 2014

Yoshio Komori and Kevin Burrage. A stochastic exponential euler scheme for simulation of stiff biochemical reaction systems.BIT Numerical Mathematics, 54(4):1067–1085, 2014

work page 2014

[31] [31]

Optimizing the diffusion coefficient of overdamped Langevin dynamics.arXiv preprint arXiv:2404.12087, 2024

Tony Lelièvre, Grigorios A Pavliotis, Geneviève Robin, Régis Santet, and Gabriel Stoltz. Optimizing the diffusion coefficient of overdamped Langevin dynamics.arXiv preprint arXiv:2404.12087, 2024

work page arXiv 2024

[32] [32]

Improving sampling by modifying the effective diffusion.Journal of Computational Physics, 541:114313, 2025

Tony Lelièvre, Régis Santet, and Gabriel Stoltz. Improving sampling by modifying the effective diffusion.Journal of Computational Physics, 541:114313, 2025

work page 2025

[33] [33]

Score-based diffusion models in function space.Journal of Machine Learning Research, 26(158):1–62, 2025

Jae Hyun Lim, Nikola B Kovachki, Ricardo Baptista, Christopher Beckham, Kamyar Azizzade- nesheli, Jean Kossaifi, Vikram V oleti, Jiaming Song, Karsten Kreis, Jan Kautz, et al. Score-based diffusion models in function space.Journal of Machine Learning Research, 26(158):1–62, 2025

work page 2025

[34] [34]

Sampling can be faster than optimization.Proceedings of the National Academy of Sciences, 116(42):20881– 20885, 2019

Yi-An Ma, Yuansi Chen, Chi Jin, Nicolas Flammarion, and Michael I Jordan. Sampling can be faster than optimization.Proceedings of the National Academy of Sciences, 116(42):20881– 20885, 2019

work page 2019

[35] [35]

Simulated tempering: a new Monte Carlo scheme.EPL (Europhysics Letters), 19(6):451–458, 1992

Enzo Marinari and Giorgio Parisi. Simulated tempering: a new Monte Carlo scheme.EPL (Europhysics Letters), 19(6):451–458, 1992

work page 1992

[36] [36]

John Wiley & Sons, 2000

Geoffrey J McLachlan and David Peel.Finite mixture models. John Wiley & Sons, 2000

work page 2000

[37] [37]

Sampling from multimodal distributions using tempered transitions.Statistics and computing, 6(4):353–366, 1996

Radford M Neal. Sampling from multimodal distributions using tempered transitions.Statistics and computing, 6(4):353–366, 1996

work page 1996

[38] [38]

Annealed importance sampling.Statistics and Computing, 11(2):125–139, 2001

Radford M Neal. Annealed importance sampling.Statistics and Computing, 11(2):125–139, 2001

work page 2001

[39] [39]

Kullback-Leibler divergence estimation of continuous distributions

Fernando Pérez-Cruz. Kullback-Leibler divergence estimation of continuous distributions. In 2008 IEEE international symposium on information theory, pages 1666–1670. IEEE, 2008

work page 2008

[40] [40]

Infinite-dimensional diffusion models.Journal of Machine Learning Research, 25(414):1–52, 2024

Jakiw Pidstrigach, Youssef Marzouk, Sebastian Reich, and Sven Wang. Infinite-dimensional diffusion models.Journal of Machine Learning Research, 25(414):1–52, 2024

work page 2024

[41] [41]

Improving the convergence of reversible samplers.Journal of Statistical Physics, 164(3):472–494, 2016

Luc Rey-Bellet and Konstantinos Spiliopoulos. Improving the convergence of reversible samplers.Journal of Statistical Physics, 164(3):472–494, 2016

work page 2016

[42] [42]

Optimal scaling for various Metropolis-Hastings algorithms.Statistical science, 16(4):351–367, 2001

Gareth O Roberts and Jeffrey S Rosenthal. Optimal scaling for various Metropolis-Hastings algorithms.Statistical science, 16(4):351–367, 2001

work page 2001

[43] [43]

Langevin diffusions and Metropolis-Hastings algorithms

Gareth O Roberts and Osnat Stramer. Langevin diffusions and Metropolis-Hastings algorithms. Methodology and computing in applied probability, 4(4):337–357, 2002

work page 2002

[44] [44]

Poincaré and log–sobolev inequalities for mixtures.Entropy, 21(1):89, 2019

André Schlichting. Poincaré and log–sobolev inequalities for mixtures.Entropy, 21(1):89, 2019

work page 2019

[45] [45]

The convergence and ms stability of exponential Euler method for semilinear stochastic differential equations.Abstract and Applied Analysis, 2012:350407, 2012

Chunmei Shi, Yu Xiao, and Chiping Zhang. The convergence and ms stability of exponential Euler method for semilinear stochastic differential equations.Abstract and Applied Analysis, 2012:350407, 2012

work page 2012

[46] [46]

Generative modeling by estimating gradients of the data distribution.Advances in Neural Information Processing Systems, 32, 2019

Yang Song and Stefano Ermon. Generative modeling by estimating gradients of the data distribution.Advances in Neural Information Processing Systems, 32, 2019

work page 2019

[47] [47]

Improved techniques for training score-based generative models

Yang Song and Stefano Ermon. Improved techniques for training score-based generative models. Advances in Neural Information Processing Systems, 33:12438–12448, 2020. 12

work page 2020

[48] [48]

Inverse problems: a Bayesian perspective.Acta numerica, 19:451–559, 2010

Andrew M Stuart. Inverse problems: a Bayesian perspective.Acta numerica, 19:451–559, 2010

work page 2010

[49] [49]

Divergence estimation for multidimen- sional densities via k-nearest-neighbor distances.IEEE Transactions on Information Theory, 55(5):2392–2405, 2009

Qing Wang, Sanjeev R Kulkarni, and Sergio Verdú. Divergence estimation for multidimen- sional densities via k-nearest-neighbor distances.IEEE Transactions on Information Theory, 55(5):2392–2405, 2009

work page 2009

[50] [50]

Fast sampling of diffusion models with exponential integrator.International Conference on Learning Representations, 2023

Qinsheng Zhang and Yongxin Chen. Fast sampling of diffusion models with exponential integrator.International Conference on Learning Representations, 2023

work page 2023

[51] [51]

Annealed Langevin dynamics for massive MIMO detection.IEEE Transactions on Wireless Communications, 22(6):3762–3776, 2022

Nicolas Zilberstein, Chris Dick, Rahman Doost-Mohammady, Ashutosh Sabharwal, and Santi- ago Segarra. Annealed Langevin dynamics for massive MIMO detection.IEEE Transactions on Wireless Communications, 22(6):3762–3776, 2022

work page 2022

[52] [52]

Solving linear inverse problems using higher-order annealed Langevin diffusion.IEEE Transactions on Signal Processing, 72:492–505, 2024

Nicolas Zilberstein, Ashutosh Sabharwal, and Santiago Segarra. Solving linear inverse problems using higher-order annealed Langevin diffusion.IEEE Transactions on Signal Processing, 72:492–505, 2024. 13 A Proofs of Section 3 A.1 Proof of Proposition 3.2 Since the two mixture components differ only in the first coordinate, the annealed path is πd t = 1 2 N...

work page 2024

[53] [53]

1 vi,t,j − (xj −m ij)2 v2 i,t,j # . Therefore ζ d i,t(x)−ζ d ℓ,t(x) = 1 2T dX j=1 λj

= KL(ν⋆,1 ∥ν 0,1) + dX j=2 F(r j). SinceP j≥2 r2 j <∞, we haver j →0, so for all sufficiently largejalsor j ≤1, and then F(r j)≤ r2 j 4 . Hence X j≥2 F(r j)<∞, and the first-coordinate contribution is a fixed constant independent ofd. Therefore sup d≥1 KL(πd ⋆ ∥π d 0)<∞. This proves the claim. 15 A.2 Proof of Proposition 3.3 By joint convexity of relative...

work page

[54] [54]

In the example, σ1j =σ j =j −6, σ 2j =σ j +δ j, δ 1 = 0, δ j =j −12 (j≥2)

+ 1 2 N(m d 2,Σ d 2), ρ d 0 =ρ d ⋆ ∗ N(0, C d). In the example, σ1j =σ j =j −6, σ 2j =σ j +δ j, δ 1 = 0, δ j =j −12 (j≥2). Hence σj =σ j =j −6, σj =σ j +δ j, σj −σ j =δ j. Moreover, m1 =a, mj = 0 (j≥2). Define ∆mj := sup i,ℓ∈I |mij −m ℓj|. Hence ∆m1 = 2a,∆m j = 0 (j≥2). We first verify the summability assumptions (9)–(15). Since λj =j −6, γ j =j −4, δ j =...

work page