Complexity Analysis of Normalizing Constant Estimation: from Jarzynski Equality to Annealed Importance Sampling and beyond
Pith reviewed 2026-05-23 03:54 UTC · model grok-4.3
The pith
Annealed importance sampling estimates the normalizing constant Z to relative error ε with Õ(d β² A² / ε⁴) oracle complexity under finite action assumptions.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We derive an oracle complexity of Õ(d β² A² / ε⁴) for estimating Z within ε relative error with high probability using annealed importance sampling. This holds when there exists a curve of interpolating measures with finite action A between the target and a tractable reference. The analysis leverages Girsanov's theorem and optimal transport and does not require isoperimetric assumptions on the target distribution. To handle the large action of standard geometric interpolation, we introduce a reverse diffusion sampler algorithm, establish its complexity framework, and show empirically that it handles multimodality efficiently.
What carries the argument
The action A of a curve of probability measures interpolating between the target and the reference distribution. It enters the complexity bound by controlling the variance of the estimator through Girsanov's theorem.
Load-bearing premise
A curve of interpolating measures with finite action A exists between the target distribution and a tractable reference, allowing Girsanov's theorem to be applied to the underlying stochastic processes.
What would settle it
A controlled experiment on an isotropic Gaussian target, for which the action A and the required sample counts can be computed exactly: vary d, β, A, and ε and check whether the observed number of oracle calls scales as Õ(d β² A² / ε⁴).
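A minimal sketch of such a check, in Python with NumPy. It is not the paper's algorithm: the target is the isotropic Gaussian π ∝ exp(-|x|²/(2σ²)), the path is the geometric interpolation starting from N(0, I), and an exactly invariant autoregressive Gaussian kernel is swapped in for Langevin-type transitions so that the classical AIS weights remain unbiased. The function names and all parameter values (`d`, `sigma`, `n_steps`, `n_chains`, `rho`) are illustrative assumptions.

```python
import numpy as np


def exact_log_z(d, sigma):
    """Closed form: log of the integral of exp(-|x|^2 / (2 sigma^2)) over R^d."""
    return 0.5 * d * np.log(2.0 * np.pi * sigma**2)


def ais_log_z(d=5, sigma=2.0, n_steps=200, n_chains=2000, rho=0.5, seed=0):
    """Estimate log Z for V(x) = |x|^2 / (2 sigma^2) with classical AIS."""
    rng = np.random.default_rng(seed)
    thetas = np.linspace(0.0, 1.0, n_steps + 1)

    def precision(theta):
        # Geometric path: V_theta = (1 - theta) * |x|^2 / 2 + theta * V,
        # so pi_theta = N(0, I / precision(theta)).
        return (1.0 - theta) + theta / sigma**2

    x = rng.standard_normal((n_chains, d))  # exact draws from the reference N(0, I)
    log_w = np.zeros(n_chains)
    for k in range(1, n_steps + 1):
        # AIS weight update: V_{theta_{k-1}}(x) - V_{theta_k}(x) at the current state.
        sq_norm = np.sum(x**2, axis=1)
        log_w += 0.5 * (precision(thetas[k - 1]) - precision(thetas[k])) * sq_norm
        # Assumed stand-in kernel: AR(1) move that leaves pi_{theta_k} exactly invariant.
        std = 1.0 / np.sqrt(precision(thetas[k]))
        noise = rng.standard_normal(x.shape)
        x = rho * x + np.sqrt(1.0 - rho**2) * std * noise

    # log Z = log Z_0 + log mean(w), computed with a log-sum-exp shift for stability.
    log_z0 = 0.5 * d * np.log(2.0 * np.pi)
    shift = log_w.max()
    return log_z0 + shift + np.log(np.mean(np.exp(log_w - shift)))
```

Comparing `ais_log_z(...)` against `exact_log_z(...)` while varying `d`, `n_steps`, and `n_chains` gives an error-versus-cost curve of the kind the experiment would need. Computing the action A of the path and varying β are left out of this sketch.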
Original abstract
Given an unnormalized probability density $\pi\propto\mathrm{e}^{-V}$, estimating its normalizing constant $Z=\int_{\mathbb{R}^d}\mathrm{e}^{-V(x)}\mathrm{d}x$ or free energy $F=-\log Z$ is a crucial problem in Bayesian statistics, statistical mechanics, and machine learning. It is challenging especially in high dimensions or when $\pi$ is multimodal. To mitigate the high variance of conventional importance sampling estimators, annealing-based methods such as Jarzynski equality and annealed importance sampling are commonly adopted, yet their quantitative complexity guarantees remain largely unexplored. We take a first step toward a non-asymptotic analysis of annealed importance sampling. In particular, we derive an oracle complexity of $\widetilde{O}\left(\frac{d\beta^2{\mathcal{A}}^2}{\varepsilon^4}\right)$ for estimating $Z$ within $\varepsilon$ relative error with high probability, where $\beta$ is the smoothness of $V$ and $\mathcal{A}$ denotes the action of a curve of probability measures interpolating $\pi$ and a tractable reference distribution. Our analysis, leveraging Girsanov's theorem and optimal transport, does not explicitly require isoperimetric assumptions on the target distribution. Finally, to tackle the large action of the widely used geometric interpolation, we propose a new algorithm based on reverse diffusion samplers, establish a framework for analyzing its complexity, and empirically demonstrate its efficiency in tackling multimodality.
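The abstract states the guarantee as a relative error on $Z$ while also naming the free energy $F=-\log Z$ as a target. A short worked step, under the assumed notation that $\hat Z$ is any estimator of $Z$, $\hat F=-\log\hat Z$, and $0<\varepsilon\le 1/2$, shows how the first kind of guarantee transfers to the second:

```latex
% Assumed notation: \hat Z is an estimator of Z, \hat F = -\log \hat Z, 0 < \varepsilon \le 1/2.
\[
\left|\frac{\hat Z}{Z} - 1\right| \le \varepsilon
\;\Longrightarrow\;
|\hat F - F| = \left|\log\frac{\hat Z}{Z}\right|
\le \max\{\log(1+\varepsilon),\, -\log(1-\varepsilon)\}
= -\log(1-\varepsilon) \le 2\varepsilon .
\]
```

So an $\varepsilon$ relative-error estimate of $Z$ gives an absolute error of at most $2\varepsilon$ on $F$ in this regime.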
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims to provide the first non-asymptotic oracle complexity analysis of annealed importance sampling (AIS) for estimating the normalizing constant Z of an unnormalized density π ∝ e^{-V(x)}. It derives a bound of Õ(d β² A² / ε⁴) for achieving relative error ε with high probability, where β measures smoothness of V and A is the action of an interpolating curve of measures connecting π to a tractable reference; the analysis applies Girsanov's theorem together with optimal transport and avoids explicit isoperimetric assumptions on the target. The paper also introduces a reverse-diffusion-sampler algorithm to mitigate large action under geometric interpolation and reports empirical gains on multimodal targets.
Significance. If the central bound is valid, the result would be significant: it supplies the first quantitative non-asymptotic guarantee for a family of methods (Jarzynski equality, AIS) that are standard in Bayesian statistics, statistical mechanics, and machine learning yet previously lacked complexity statements. The combination of Girsanov and optimal transport to remove isoperimetric hypotheses is technically interesting, and the reverse-diffusion proposal directly addresses a practical bottleneck of geometric paths. The framework is reusable for other annealing schedules.
major comments (2)
- [Abstract, §1] Abstract and analysis paragraph (§1): the claimed Õ(d β² A² / ε⁴) bound is obtained by using Girsanov's theorem to produce an unbiased estimator from the Radon-Nikodym derivative between the forward and reverse processes along the interpolating curve. The change of measure is valid (and the estimator unbiased) only when the exponential local martingale is a true martingale; the standard sufficient condition for this is Novikov's, E[exp(½ ∫ |u_t|² dt)] < ∞. Finite action A (presumably ∫ E[|u_t|²] dt < ∞) controls the L² norm but does not automatically imply this exponential moment. The manuscript asserts that no isoperimetric assumptions on the target are needed, yet provides no explicit verification or supplementary integrability condition that would guarantee Novikov's condition, or any other martingale criterion, for arbitrary curves with only finite A. This step is load-bearing for the unbiasedness claim and therefore for the complexity bound.
- [Abstract, §1] Abstract and §1: the stated oracle complexity is expressed in terms of the external quantity A (action of the chosen interpolating curve). The manuscript does not indicate whether a curve with A independent of the target accuracy ε can always be selected, or whether constructing such a curve (and therefore controlling A) may itself depend on ε in a manner that alters the overall complexity. Without this clarification the bound cannot be read as a fully non-asymptotic guarantee in the usual sense.
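The gap raised in the first major comment can be written out explicitly. The notation here is assumed for illustration: $u_t$ is the drift difference entering Girsanov's theorem on $[0,T]$, and $X=\int_0^T |u_t|^2\,\mathrm{d}t$. The example is stated at the level of a random variable only; whether such an $X$ arises from a given interpolating curve is a separate question.

```latex
% Assumed notation: u_t is the Girsanov drift difference on [0, T], X = \int_0^T |u_t|^2 dt.
\[
\mathbb{E}\exp\!\Big(\tfrac12 X\Big) < \infty
\;\Longrightarrow\;
\mathbb{E}\,X < \infty
\qquad\text{(Jensen: } \mathbb{E}\,\mathrm{e}^{X/2} \ge \mathrm{e}^{\mathbb{E}X/2}\text{)},
\]
\[
\text{but}\qquad
\mathbb{E}\,X < \infty
\;\not\Longrightarrow\;
\mathbb{E}\exp\!\Big(\tfrac12 X\Big) < \infty .
\]
% Illustration: if X has density 2x^{-3} on [1, \infty), then
\[
\mathbb{E}\,X = \int_1^\infty 2x^{-2}\,\mathrm{d}x = 2,
\qquad
\mathbb{E}\,\mathrm{e}^{X/2} = \int_1^\infty 2x^{-3}\,\mathrm{e}^{x/2}\,\mathrm{d}x = \infty .
\]
```

A finite first moment of $X$ is therefore strictly weaker than Novikov's condition, which is why the comment asks for an additional integrability hypothesis or a different martingale criterion.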
minor comments (2)
- [Abstract] The abstract states that the analysis 'does not explicitly require isoperimetric assumptions,' but the precise regularity conditions placed on the interpolating curve (e.g., moment bounds on the Girsanov kernel) should be stated explicitly in the theorem statement for clarity.
- Notation: β is used for the smoothness parameter of V; a brief reminder of its precise definition (e.g., Lipschitz constant of ∇V) would help readers who are not already familiar with the paper's conventions.
Simulated Author's Rebuttal
We thank the referee for the careful reading and for highlighting these two technical points on the application of Girsanov's theorem and the interpretation of the oracle complexity. Both comments are addressed point-by-point below. We believe the core claims remain valid once the requested clarifications are supplied.
Point-by-point responses
-
Referee: Girsanov's change of measure yields an unbiased estimator only when the exponential local martingale is a true martingale, for which Novikov's condition E[exp(½ ∫ |u_t|² dt)] < ∞ is the standard sufficient criterion. Finite action A controls the L² norm but does not automatically imply the required exponential moment. No explicit verification or supplementary integrability condition is provided for arbitrary curves with only finite A.
Authors: We agree that a condition such as Novikov's is needed for the exponential local martingale to be a true martingale. Our analysis implicitly assumes the SDEs admit a well-defined Girsanov change of measure; finite A guarantees that the quadratic variation ∫ |u_t|² dt has finite expectation, but does not by itself guarantee exponential integrability. Under the standing Lipschitz and linear-growth assumptions already placed on the drift and diffusion coefficients (standard for the Langevin and reverse-diffusion processes considered), standard results in stochastic analysis (e.g., Theorem 5.1 in Karatzas & Shreve or Proposition 3.1 in Øksendal) ensure that Novikov's condition holds locally and that the martingale property then extends to compact time intervals. We will add a short paragraph after the statement of Girsanov's theorem (new Section 2.3) that explicitly invokes these conditions and notes that they are satisfied by the geometric and reverse-diffusion schedules analyzed later. This is a clarification rather than a change to the complexity bound itself. revision: partial
-
Referee: The stated oracle complexity is expressed in terms of the external quantity A. The manuscript does not indicate whether a curve with A independent of the target accuracy ε can always be selected, or whether constructing such a curve may itself depend on ε in a manner that alters the overall complexity.
Authors: A is the action of a fixed interpolating curve of measures chosen independently of the accuracy parameter ε; it is a property of the annealing schedule (geometric, arithmetic, or reverse-diffusion) and of the pair (π, reference). For any fixed schedule the value of A is therefore independent of ε, and the Õ(d β² A² / ε⁴) bound is fully non-asymptotic once the schedule is selected. When a practitioner chooses a schedule whose A grows with dimension or with the separation of modes, that growth appears explicitly in the complexity; the reverse-diffusion construction is introduced precisely to produce schedules whose A remains moderate. We will insert one clarifying sentence at the end of the first paragraph of Section 1 and a footnote in the complexity theorem stating that A is schedule-dependent but ε-independent. revision: yes
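The scaling argument in this response can be made concrete with a small sketch. It evaluates only the leading term d β² A² / ε⁴ of the stated bound, with constants and logarithmic factors dropped; the function name and the numerical inputs are hypothetical and do not come from the paper.

```python
def ais_oracle_scaling(d, beta, action, eps):
    """Leading term d * beta^2 * A^2 / eps^4 (constants and log factors dropped)."""
    if eps <= 0:
        raise ValueError("eps must be positive")
    return d * beta**2 * action**2 / eps**4


# Hypothetical inputs chosen only to illustrate the scaling.
base = ais_oracle_scaling(d=10, beta=1.0, action=5.0, eps=0.1)

# With the schedule (hence A) held fixed, halving eps multiplies the term by 2^4.
ratio_eps = ais_oracle_scaling(d=10, beta=1.0, action=5.0, eps=0.05) / base

# With eps held fixed, doubling A multiplies the term by 2^2.
ratio_action = ais_oracle_scaling(d=10, beta=1.0, action=10.0, eps=0.1) / base
```

Because A enters as a fixed input, the ε-dependence is carried entirely by the ε⁻⁴ factor; a schedule whose A grew as ε shrank would change that exponent, which is the situation the referee asks the authors to rule out.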
Circularity Check
No circularity: complexity bound parameterized by external action A using standard theorems
full rationale
The claimed oracle complexity Õ(d β² A² / ε⁴) is expressed directly in terms of the external quantity A (action of an interpolating curve of measures), with the derivation invoking Girsanov's theorem and optimal transport as independent mathematical tools. No step reduces a prediction to a fitted input, renames a known result, or relies on a load-bearing self-citation chain. The bound is not self-definitional; A is an input to the analysis rather than derived from the target result. The absence of isoperimetric assumptions is presented as a feature of the Girsanov+OT approach, with no evidence that the central claim collapses to its own inputs by construction.
Axiom & Free-Parameter Ledger
axioms (2)
- standard math: Girsanov's theorem applies to the stochastic processes along the annealing path
- domain assumption: An interpolating curve of probability measures with finite action A exists between the target and reference
Lean theorems connected to this paper
-
IndisputableMonolith/Cost/FunctionalEquation.lean: washburn_uniqueness_aczel
unclear: Relation between the paper passage and the cited Recognition theorem.
derive an oracle complexity of Õ(d β² A² / ε⁴) … leveraging Girsanov’s theorem and optimal transport … action A of a curve of probability measures
-
IndisputableMonolith/Foundation/RealityFromDistinction.lean: reality_from_one_distinction
unclear: Relation between the paper passage and the cited Recognition theorem.
Jarzynski equality … E_{P→}[e^{-W}] = e^{-ΔF}
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Forward citations
Cited by 1 Pith paper
-
Sample-efficient evidence estimation of score based priors for model selection
DiME estimates model evidence for diffusion priors by integrating time-marginals from posterior sampling, enabling efficient prior selection and misfit diagnosis in ill-posed inverse problems.
Reference graph
Works this paper leans on
-
[1]
M. S. Albergo and E. Vanden-Eijnden. NETS: A non-equilibrium transport sampler. In Forty-second International Conference on Machine Learning, 2025. URL https://openreview.net/forum?id=QqGw9StPbQ
-
[2]
L. Ambrosio, N. Gigli, and G. Savaré. Gradient Flows: In Metric Spaces and in the Space of Probability Measures. Lectures in Mathematics. ETH Zürich. Birkhäuser Basel, 2nd edition, 2008. doi:10.1007/978-3-7643-8722-8
-
[3]
L. Ambrosio, E. Brué, and D. Semola. Lectures on optimal transport, volume 130 of UNITEXT. Springer Cham, 2021. doi:10.1007/978-3-030-72162-6. URL https://link.springer.com/book/10.1007/978-3-030-72162-6
-
[4]
B. D. Anderson. Reverse-time diffusion equation models. Stochastic Processes and their Applications, 12(3):313--326, 1982. ISSN 0304-4149. doi:10.1016/0304-4149(82)90051-5. URL https://www.sciencedirect.com/science/article/pii/0304414982900515
-
[5]
C. Andrieu, J. Ridgway, and N. Whiteley. Sampling normalizing constants in high dimensions using inhomogeneous diffusions. arXiv preprint arXiv:1612.07583, 2016
-
[6]
M. Arrar, F. M. Boubeta, M. E. Szretter, M. Sued, L. Boechi, and D. Rodriguez. On the accurate estimation of free energies using the Jarzynski equality. Journal of Computational Chemistry, 40(4):688--696, 2019. doi:10.1002/jcc.25754. URL https://onlinelibrary.wiley.com/doi/abs/10.1002/jcc.25754
-
[7]
E. Aurell, C. Mejía-Monasterio, and P. Muratore-Ginanneschi. Optimal protocols and optimal transport in stochastic thermodynamics. Phys. Rev. Lett., 106:250601, Jun 2011. doi:10.1103/PhysRevLett.106.250601. URL https://link.aps.org/doi/10.1103/PhysRevLett.106.250601
-
[8]
E. Aurell, K. Gawędzki, C. Mejía-Monasterio, R. Mohayaee, and P. Muratore-Ginanneschi. Refined second law of thermodynamics for fast random processes. Journal of Statistical Physics, 147:487--505, 2012. doi:10.1007/s10955-012-0478-x
-
[9]
D. Bakry, I. Gentil, and M. Ledoux. Analysis and geometry of Markov diffusion operators, volume 103 of Grundlehren der mathematischen Wissenschaften. Springer Cham, 1st edition, 2014. doi:10.1007/978-3-319-00227-9
-
[10]
K. Balasubramanian, S. Chewi, M. A. Erdogdu, A. Salim, and S. Zhang. Towards a theory of non-log-concave sampling: First-order stationarity guarantees for Langevin Monte Carlo. In P.-L. Loh and M. Raginsky, editors, Proceedings of Thirty Fifth Conference on Learning Theory, volume 178 of Proceedings of Machine Learning Research, pages 2896--2923. PMLR, 0...
-
[11]
D. Blessing, J. Berner, L. Richter, and G. Neumann. Underdamped diffusion bridges with applications to sampling. In The Thirteenth International Conference on Learning Representations, 2025. URL https://openreview.net/forum?id=Q1QTxFm0Is
-
[12]
N. Brosse, A. Durmus, and E. Moulines. Normalizing constants of log-concave densities. Electronic Journal of Statistics, 12(1):851--889, 2018. doi:10.1214/18-EJS1411. URL https://doi.org/10.1214/18-EJS1411
-
[13]
D. Carbone, M. Hua, S. Coste, and E. Vanden-Eijnden. Efficient training of energy-based models using Jarzynski equality. In A. Oh, T. Naumann, A. Globerson, K. Saenko, M. Hardt, and S. Levine, editors, Advances in Neural Information Processing Systems, volume 36, pages 52583--52614. Curran Associates, Inc., 2023. URL https://proceedings.neurips.cc/paper_f...
-
[14]
S. Chatterjee and P. Diaconis. The sample size required in importance sampling. The Annals of Applied Probability, 28(2):1099--1135, 2018. doi:10.1214/17-AAP1326. URL https://doi.org/10.1214/17-AAP1326
-
[15]
O. Chehab, A. Hyvärinen, and A. Risteski. Provable benefits of annealing for estimating normalizing constants: Importance sampling, noise-contrastive estimation, and beyond. In Thirty-seventh Conference on Neural Information Processing Systems, 2023. URL https://openreview.net/forum?id=iWGC0Nsq9i
- [16]
-
[17]
J. Chemseddine, C. Wald, R. Duong, and G. Steidl. Neural sampling from Boltzmann densities: Fisher-Rao curves in the Wasserstein geometry. In The Thirteenth International Conference on Learning Representations, 2025. URL https://openreview.net/forum?id=TUvg5uwdeG
-
[18]
H. Chen and L. Ying. Ensemble-based annealed importance sampling. arXiv preprint arXiv:2401.15645, 2024
-
[19]
J. Chen, L. Richter, J. Berner, D. Blessing, G. Neumann, and A. Anandkumar. Sequential controlled Langevin Diffusions. In The Thirteenth International Conference on Learning Representations, 2025. URL https://openreview.net/forum?id=dImD2sgy86
-
[20]
M.-H. Chen and Q.-M. Shao. On Monte Carlo methods for estimating ratios of normalizing constants. The Annals of Statistics, 25(4):1563--1594, 1997. doi:10.1214/aos/1031594732. URL https://doi.org/10.1214/aos/1031594732
-
[21]
S. Chen, S. Chewi, J. Li, Y. Li, A. Salim, and A. Zhang. Sampling is as easy as learning the score: theory for diffusion models with minimal data assumptions. In The Eleventh International Conference on Learning Representations, 2023. URL https://openreview.net/forum?id=zyLVMgsZ0U_
-
[22]
Y. Chen, T. T. Georgiou, and M. Pavon. On the relation between optimal transport and Schrödinger bridges: A stochastic control viewpoint. Journal of Optimization Theory and Applications, 169:671--691, 2016. doi:10.1007/s10957-015-0803-z
-
[23]
Y. Chen, T. T. Georgiou, and A. Tannenbaum. Stochastic control and nonequilibrium thermodynamics: Fundamental limits. IEEE Transactions on Automatic Control, 65(7):2979--2991, 2020. doi:10.1109/TAC.2019.2939625
-
[24]
Y. Chen, T. T. Georgiou, and M. Pavon. Stochastic control liaisons: Richard Sinkhorn meets Gaspard Monge on a Schrödinger bridge. SIAM Review, 63(2):249--313, 2021. doi:10.1137/20M1339982. URL https://doi.org/10.1137/20M1339982
-
[25]
X. Cheng, N. S. Chatterji, P. L. Bartlett, and M. I. Jordan. Underdamped Langevin MCMC: A non-asymptotic analysis. In S. Bubeck, V. Perchet, and P. Rigollet, editors, Proceedings of the 31st Conference On Learning Theory, volume 75 of Proceedings of Machine Learning Research, pages 300--323. PMLR, 06--09 Jul 2018. URL https://proceedings.mlr.press/v75/ch...
-
[26]
X. Cheng, B. Wang, J. Zhang, and Y. Zhu. Fast conditional mixing of MCMC algorithms for non-log-concave distributions. In A. Oh, T. Naumann, A. Globerson, K. Saenko, M. Hardt, and S. Levine, editors, Advances in Neural Information Processing Systems, volume 36, pages 13374--13394. Curran Associates, Inc., 2023. URL https://proceedings.neurips.cc/paper_fil...
-
[27]
S. Chewi. Log-Concave Sampling. Book draft, in preparation, 2022. URL https://chewisinho.github.io
-
[28]
S. Chewi, M. A. Erdogdu, M. Li, R. Shen, and S. Zhang. Analysis of Langevin Monte Carlo from Poincaré to log-Sobolev. In P.-L. Loh and M. Raginsky, editors, Proceedings of Thirty Fifth Conference on Learning Theory, volume 178 of Proceedings of Machine Learning Research, pages 1--2. PMLR, 02--05 Jul 2022. URL https://proceedings.mlr.press/v178/chewi22a.html
-
[29]
C. Chipot and A. Pohorille, editors. Free Energy Calculations: Theory and Applications in Chemistry and Biology. Springer Series in Chemical Physics. Springer Berlin, Heidelberg, 2007. doi:10.1007/978-3-540-38448-9
-
[30]
G. Conforti and L. Tamanini. A formula for the time derivative of the entropic cost and applications. Journal of Functional Analysis, 280(11):108964, 2021. ISSN 0022-1236. doi:10.1016/j.jfa.2021.108964. URL https://www.sciencedirect.com/science/article/pii/S002212362100046X
-
[31]
B. Cousins and S. Vempala. Gaussian cooling and O^*(n^3) algorithms for volume and Gaussian volume. SIAM Journal on Computing, 47(3):1237--1273, 2018. doi:10.1137/15M1054250. URL https://doi.org/10.1137/15M1054250
-
[32]
G. E. Crooks. Nonequilibrium measurements of free energy differences for microscopically reversible Markovian systems. Journal of Statistical Physics, 90:1481--1487, 1998. doi:10.1023/A:1023208217925
-
[33]
G. E. Crooks. Entropy production fluctuation theorem and the nonequilibrium work relation for free energy differences. Phys. Rev. E, 60:2721--2726, Sep 1999. doi:10.1103/PhysRevE.60.2721. URL https://link.aps.org/doi/10.1103/PhysRevE.60.2721
-
[34]
P. Del Moral, A. Doucet, and A. Jasra. Sequential Monte Carlo samplers. Journal of the Royal Statistical Society Series B: Statistical Methodology, 68(3):411--436, 5 2006
-
[35]
A. Doucet, S. Godsill, and C. Andrieu. On sequential Monte Carlo sampling methods for Bayesian filtering. Statistics and Computing, 10:197--208, 2000. doi:10.1023/A:1008935410038
-
[36]
A. Doucet, W. Grathwohl, A. G. Matthews, and H. Strathmann. Score-based diffusion meets annealed importance sampling. In S. Koyejo, S. Mohamed, A. Agarwal, D. Belgrave, K. Cho, and A. Oh, editors, Advances in Neural Information Processing Systems, volume 35, pages 21482--21494. Curran Associates, Inc., 2022. URL https://proceedings.neurips.cc/paper_files/...
-
[37]
M. Dyer, A. Frieze, and R. Kannan. A random polynomial-time algorithm for approximating the volume of convex bodies. J. ACM, 38(1):1--17, Jan. 1991. ISSN 0004-5411. doi:10.1145/102782.102783. URL https://doi.org/10.1145/102782.102783
-
[38]
I. Echeverria and L. M. Amzel. Estimation of free-energy differences from computed work distributions: An application of Jarzynski's equality. The Journal of Physical Chemistry B, 116(36):10977--11396, 2012. doi:10.1021/jp300527q
-
[39]
R. Flamary, N. Courty, A. Gramfort, M. Z. Alaya, A. Boisbunon, S. Chambon, L. Chapel, A. Corenflos, K. Fatras, N. Fournier, L. Gautheron, N. T. Gayraud, H. Janati, A. Rakotomamonjy, I. Redko, A. Rolet, A. Schutz, V. Seguy, D. J. Sutherland, R. Tavenard, A. Tong, and T. Vayer. POT: Python optimal transport. Journal of Machine Learning Research, 22(78):...
-
[40]
H. Ge and D.-Q. Jiang. Generalized Jarzynski's equality of inhomogeneous multidimensional diffusion processes. Journal of Statistical Physics, 131:675--689, 3 2008. ISSN 1572-9613. doi:10.1007/s10955-008-9520-4
-
[41]
R. Ge, H. Lee, and A. Risteski. Beyond log-concavity: Provable guarantees for sampling multi-modal distributions using simulated tempering Langevin Monte Carlo. In S. Bengio, H. Wallach, H. Larochelle, K. Grauman, N. Cesa-Bianchi, and R. Garnett, editors, Advances in Neural Information Processing Systems, volume 31. Curran Associates, Inc., 2018. URL htt...
-
[42]
R. Ge, H. Lee, and J. Lu. Estimating normalizing constants for log-concave distributions: Algorithms and lower bounds. In Proceedings of the 52nd Annual ACM SIGACT Symposium on Theory of Computing, STOC 2020, page 579–586, New York, NY, USA, 2020. Association for Computing Machinery. ISBN 9781450369794. doi:10.1145/3357713.3384289. URL https://doi.org/10....
-
[43]
A. Gelman and X.-L. Meng. Simulating normalizing constants: from importance sampling to bridge sampling to path sampling. Statistical Science, 13(2):163--185, 1998. doi:10.1214/ss/1028905934. URL https://doi.org/10.1214/ss/1028905934
- [44]
-
[45]
W. Guo, M. Tao, and Y. Chen. Provable benefit of annealed Langevin Monte Carlo for non-log-concave sampling. In The Thirteenth International Conference on Learning Representations, 2025. URL https://openreview.net/forum?id=P6IVIoGRRg
-
[46]
C. Hartmann and L. Richter. Nonasymptotic bounds for suboptimal importance sampling. SIAM/ASA Journal on Uncertainty Quantification, 12(2):309--346, 2024. doi:10.1137/21M1427760. URL https://doi.org/10.1137/21M1427760
-
[47]
C. Hartmann, L. Richter, C. Schütte, and W. Zhang. Variational characterization of free energy: Theory and algorithms. Entropy, 19(11), 2017. ISSN 1099-4300. doi:10.3390/e19110626. URL https://www.mdpi.com/1099-4300/19/11/626
-
[48]
C. Hartmann, C. Schütte, and W. Zhang. Jarzynski's equality, fluctuation theorems, and variance reduction: Mathematical analysis and numerical algorithms. Journal of Statistical Physics, 175:1214--1261, 2019. doi:10.1007/s10955-019-02286-4. URL https://doi.org/10.1007/s10955-019-02286-4
- [49]
-
[50]
Y. He and C. Zhang. On the query complexity of sampling from non-log-concave distributions (extended abstract). In N. Haghtalab and A. Moitra, editors, Proceedings of Thirty Eighth Conference on Learning Theory, volume 291 of Proceedings of Machine Learning Research, pages 2786--2787. PMLR, 30 Jun--04 Jul 2025. URL https://proceedings.mlr.press/v291/he25a.html
-
[51]
Y. He, K. Rojas, and M. Tao. Zeroth-order sampling methods for non-log-concave distributions: Alleviating metastability by denoising diffusion. In The Thirty-eighth Annual Conference on Neural Information Processing Systems, 2024. URL https://openreview.net/forum?id=X3Aljulsw5
- [52]
-
[53]
X. Huang, D. Zou, H. Dong, Y.-A. Ma, and T. Zhang. Faster sampling without isoperimetry via diffusion-based Monte Carlo. In S. Agrawal and A. Roth, editors, Proceedings of Thirty Seventh Conference on Learning Theory, volume 247 of Proceedings of Machine Learning Research, pages 2438--2493. PMLR, 30 Jun--03 Jul 2024b. URL https://proceedings.mlr.press/...
-
[54]
M. Huber. Approximation algorithms for the normalizing constant of Gibbs distributions. The Annals of Applied Probability, 25(2):974--985, 2015. doi:10.1214/14-AAP1015. URL https://doi.org/10.1214/14-AAP1015
-
[55]
C. Jarzynski. Nonequilibrium equality for free energy differences. Phys. Rev. Lett., 78:2690--2693, Apr 1997. doi:10.1103/PhysRevLett.78.2690. URL https://link.aps.org/doi/10.1103/PhysRevLett.78.2690
-
[56]
A. Jasra, K. Kamatani, P. P. Osei, and Y. Zhou. Multilevel particle filters: normalizing constant estimation. Statistics and Computing, 28:47--60, 2018. doi:10.1007/s11222-016-9715-5
-
[57]
M. R. Jerrum, L. G. Valiant, and V. V. Vazirani. Random generation of combinatorial structures from a uniform distribution. Theoretical Computer Science, 43:169--188, 1986. ISSN 0304-3975. doi:10.1016/0304-3975(86)90174-X. URL https://www.sciencedirect.com/science/article/pii/030439758690174X
-
[58]
I. Karatzas and S. E. Shreve. Brownian Motion and Stochastic Calculus. Graduate Texts in Mathematics. Springer New York, NY, 2nd edition, 1991. doi:10.1007/978-1-4612-0949-2
-
[59]
D. P. Kingma and M. Welling. Auto-encoding variational Bayes. arXiv preprint arXiv:1312.6114, 2013
-
[60]
J. G. Kirkwood. Statistical mechanics of fluid mixtures. The Journal of Chemical Physics, 3(5):300--313, 05 1935. ISSN 0021-9606. doi:10.1063/1.1749657. URL https://doi.org/10.1063/1.1749657
-
[61]
Y. Kook and S. S. Vempala. Sampling and integration of logconcave functions by algorithmic diffusion. In Proceedings of the 57th Annual ACM Symposium on Theory of Computing, STOC '25, page 924–932, New York, NY, USA, 2025. Association for Computing Machinery. ISBN 9798400715105. doi:10.1145/3717823.3718202. URL https://doi.org/10.1145/3717823.3718202
-
[62]
Y. Kook, S. Vempala, and M. S. Zhang. In-and-Out: Algorithmic diffusion for sampling convex bodies. In The Thirty-eighth Annual Conference on Neural Information Processing Systems, 2024. URL https://openreview.net/forum?id=aNQWRHyh15
-
[63]
S. Kostov and N. Whiteley. An algorithm for approximating the second moment of the normalizing constant estimate from a particle filter. Methodology and Computing in Applied Probability, 19:799--818, 2017. doi:10.1007/s11009-016-9513-8
-
[64]
O. Krause, A. Fischer, and C. Igel. Algorithms for estimating the partition function of restricted Boltzmann machines. Artificial Intelligence, 278:103195, 2020. ISSN 0004-3702. doi:10.1016/j.artint.2019.103195. URL https://www.sciencedirect.com/science/article/pii/S0004370219301948
-
[65]
C. Le Bris and P.-L. Lions. Existence and uniqueness of solutions to Fokker–Planck type equations with irregular coefficients. Communications in Partial Differential Equations, 33(7):1272--1317, 2008. doi:10.1080/03605300801970952. URL https://doi.org/10.1080/03605300801970952
-
[66]
H. Lee, J. Lu, and Y. Tan. Convergence of score-based generative modeling for general data distributions. In S. Agrawal and F. Orabona, editors, Proceedings of The 34th International Conference on Algorithmic Learning Theory, volume 201 of Proceedings of Machine Learning Research, pages 946--985. PMLR, 20 Feb--23 Feb 2023. URL https://proceedings.mlr.pres...
-
[67]
T. Lelièvre, M. Rousset, and G. Stoltz. Free Energy Computations: A Mathematical Perspective. Imperial College Press, 2010. doi:10.1142/p579
-
[68]
C. Léonard. A survey of the Schrödinger problem and some of its connections with optimal transport. Discrete and Continuous Dynamical Systems - Series A, 34(4):1533--1574, 2014. URL https://hal.science/hal-00849930
-
[69]
J. Ma, J. Peng, S. Wang, and J. Xu. Estimating the partition function of graphical models using Langevin importance sampling. In C. M. Carvalho and P. Ravikumar, editors, Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics, volume 31 of Proceedings of Machine Learning Research, pages 433--441, Scottsdale, Arizon...
-
[70]
B. Máté and F. Fleuret. Learning interpolations between Boltzmann densities. Transactions on Machine Learning Research, 2023. ISSN 2835-8856. URL https://openreview.net/forum?id=TH6YrEcbth
-
[71]
B. Máté, F. Fleuret, and T. Bereau. Neural thermodynamic integration: Free energies from energy-based diffusion models. The Journal of Physical Chemistry Letters, 15(45):11395--11404, 2024. doi:10.1021/acs.jpclett.4c01958. URL https://doi.org/10.1021/acs.jpclett.4c01958. PMID: 39503734
-
[72]
O. Mazonka and C. Jarzynski. Exactly solvable model illustrating far-from-equilibrium predictions. arXiv preprint cond-mat/9912121, 1999
-
[73]
F. Mazzanti and E. Romero. Efficient evaluation of the partition function of RBMs with annealed importance sampling. arXiv preprint arXiv:2007.11926, 2020
-
[74]
X.-L. Meng and W. H. Wong. Simulating ratios of normalizing constants via a simple identity: a theoretical exploration. Statistica Sinica, 6(4):831--860, 1996. ISSN 10170405, 19968507. URL http://www.jstor.org/stable/24306045
-
[75]
A. Mousavi-Hosseini, T. K. Farghly, Y. He, K. Balasubramanian, and M. A. Erdogdu. Towards a complete analysis of Langevin Monte Carlo: Beyond Poincaré inequality. In G. Neu and L. Rosasco, editors, Proceedings of Thirty Sixth Conference on Learning Theory, volume 195 of Proceedings of Machine Learning Research, pages 1--35. PMLR, 12--15 Jul 2023. URL h...
-
[76]
R. M. Neal. Annealed importance sampling. Statistics and Computing, 11(2):125--139, April 2001. ISSN 1573-1375. doi:10.1023/A:1008923215028. URL https://doi.org/10.1023/A:1008923215028
-
[77]
E. Nelson. Dynamical Theories of Brownian Motion. Princeton University Press, 1967. ISBN 9780691079509. URL http://www.jstor.org/stable/j.ctv15r57jg
-
[78]
N. Nüsken and L. Richter. Solving high-dimensional Hamilton--Jacobi--Bellman PDEs using neural networks: perspectives from the theory of controlled diffusions and measures on path space. Partial Differential Equations and Applications, 2(4):48, 2021. doi:10.1007/s42985-021-00102-x
-
[79]
A. Pohorille, C. Jarzynski, and C. Chipot. Good practices in free-energy calculations. The Journal of Physical Chemistry B, 114(32):10235--10253, 2010. ISSN 1520-6106. doi:10.1021/jp102971x. URL https://doi.org/10.1021/jp102971x
-
[80]
Y. Ren, H. Chen, G. M. Rotskoff, and L. Ying. How discrete and continuous diffusion meet: Comprehensive analysis of discrete diffusion models via a stochastic integral framework. In The Thirteenth International Conference on Learning Representations, 2025a. URL https://openreview.net/forum?id=6awxwQEI82