Scalable Inference-Time Annealing with Surrogate Likelihood Estimators

Daniel Peñaherrera; David Ryan Koes; Rishal Aggarwal

arxiv: 2605.31498 · v3 · pith:ZCPL27UWnew · submitted 2026-05-29 · 💻 cs.LG · q-bio.BM

Pith reviewed 2026-06-28 22:53 UTC · model grok-4.3

classification 💻 cs.LG q-bio.BM

keywords inference-time annealing · flow-based models · energy-based models · Boltzmann sampling · molecular simulation · generative modeling · surrogate likelihoods · alanine peptides

The pith

SITA retrains flow-based models with energy-based surrogate likelihoods to anneal samples down a temperature ladder without computing divergences.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that inference-time annealing of generative models for molecular Boltzmann distributions can be made scalable by replacing expensive divergence calculations with fast surrogate likelihoods supplied by a separate energy-based model. This substitution lets the method retrain flow models iteratively at lower temperatures using importance sampling, an approach previously limited to small systems. A sympathetic reader would care because conventional molecular sampling relies on slow simulations while existing generative approaches hit computational walls on larger molecules; if SITA works, it removes one of those walls for peptides and potentially beyond.

Core claim

SITA performs scalable inference-time annealing by retraining flow-based generative models along a temperature ladder, where an auxiliary energy-based model supplies surrogate likelihood estimates that replace the divergence-based importance weights required in prior methods.

What carries the argument

An energy-based surrogate likelihood estimator that replaces divergence-based importance weights when the flow model is retrained at each temperature step.
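The substitution can be written out as a short sketch. The symbols here are assumptions introduced for illustration and are not taken from the paper: $U$ is the potential energy, $T_{k+1}$ the next temperature on the ladder, $q_\theta$ the density of the current flow model with velocity field $v_\theta$, and $E_\phi$ the energy-based surrogate. The divergence integral is the standard change-of-variables identity for continuous flows; the paper's exact formulation may differ.

```latex
% Target at the next rung of the ladder:
\pi_{k+1}(x) \propto \exp\!\left(-\frac{U(x)}{k_B T_{k+1}}\right)

% Importance weight of a proposal x \sim q_\theta drawn from the current flow model:
w(x) \propto \frac{\exp\!\left(-U(x)/(k_B T_{k+1})\right)}{q_\theta(x)}

% Exact likelihood of a continuous flow \dot{x}_t = v_\theta(x_t, t), x_0 \sim p_0
% (this is the term that requires a divergence along the whole trajectory):
\log q_\theta(x_1) = \log p_0(x_0) - \int_0^1 \nabla \cdot v_\theta(x_t, t)\, dt

% Surrogate: assume \log q_\theta(x) \approx -E_\phi(x) + c, so that
\log \hat{w}(x) = -\frac{U(x)}{k_B T_{k+1}} + E_\phi(x) + \text{const}
```

Because the weights are self-normalized before resampling, the unknown constant drops out, which is why an unnormalized energy-based model is enough for this purpose.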

If this is right

The method becomes applicable to molecular systems where computing the score-field divergence is intractable.
Retraining cost is reduced because surrogate likelihood evaluation is cheaper than divergence estimation at each annealing step.
Sample quality at low temperatures improves without the overhead that previously limited annealing depth.
The approach stays within the flow-model family while sidestepping a specific computational bottleneck.
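A single annealing step of this kind can be sketched as self-normalized importance resampling. This is a toy illustration under stated assumptions, not the authors' implementation: the function name `anneal_resample`, the 1D harmonic energy, and the use of an analytically known Gaussian log-density as a stand-in for the learned surrogate are all hypothetical.

```python
import numpy as np

def anneal_resample(x, log_q_surrogate, energy, kT_target, rng):
    """Resample proposals toward exp(-energy / kT_target).

    x               : (n, d) proposals from the current (higher-temperature) model
    log_q_surrogate : (n,) surrogate log-likelihoods of x, up to an additive constant
    energy          : callable mapping (n, d) -> (n,) potential energies
    kT_target       : thermal energy k_B * T at the lower temperature
    """
    log_w = -energy(x) / kT_target - log_q_surrogate
    log_w -= log_w.max()                 # stabilise before exponentiating
    w = np.exp(log_w)
    w /= w.sum()                         # self-normalize; constants cancel here
    idx = rng.choice(len(x), size=len(x), replace=True, p=w)
    return x[idx], w

# Toy check: harmonic energy U(x) = x^2 / 2, proposals at kT = 2, target kT = 1.
rng = np.random.default_rng(0)
n = 20_000
harmonic = lambda x: 0.5 * (x ** 2).sum(axis=1)
x_high = rng.normal(0.0, np.sqrt(2.0), size=(n, 1))   # Boltzmann samples at kT = 2
log_q = -harmonic(x_high) / 2.0                        # stand-in for the surrogate
x_low, weights = anneal_resample(x_high, log_q, harmonic, kT_target=1.0, rng=rng)
```

In this toy setting the resampled points should have variance close to 1, the Boltzmann variance at the target temperature; in the method described above they would become training data for the next flow model.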

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The same surrogate-likelihood trick might transfer to other generative architectures that currently rely on divergence weighting.
If the energy-based surrogate remains accurate at very low temperatures, the method could reach conformational states that are inaccessible to standard molecular dynamics.
Testing the surrogate accuracy on a held-out set of configurations would give an early diagnostic before full annealing runs.

Load-bearing premise

An auxiliary energy-based model can supply sufficiently accurate and unbiased surrogate likelihood estimates to stand in for the true divergence terms across the entire temperature ladder.

What would settle it

Running SITA and a divergence-based baseline on alanine dipeptide or tripeptide and finding that the surrogate version produces distributions with measurably higher deviation from the reference Boltzmann density or lower effective sample size.
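The effective sample size mentioned here is commonly computed with the Kish formula from unnormalized log-weights. The snippet below is a minimal sketch of that standard estimator; the function name is an assumption, and the paper may report a different variant.

```python
import numpy as np

def effective_sample_size(log_w):
    """Kish effective sample size (sum w)^2 / sum(w^2) from unnormalized log-weights."""
    log_w = np.asarray(log_w, dtype=float)
    w = np.exp(log_w - log_w.max())      # shift for numerical stability
    return w.sum() ** 2 / np.sum(w ** 2)

# Equal weights: every sample counts.
uniform = effective_sample_size(np.zeros(1000))
# One dominant weight: the set behaves like a single sample.
degenerate = effective_sample_size(np.array([0.0] + [-50.0] * 999))
```

Comparing this quantity between surrogate-weighted and divergence-weighted runs at the same temperature step is one way to make the test above concrete.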

Figures

Figures reproduced from arXiv: 2605.31498 by Daniel Peñaherrera, David Ryan Koes, Rishal Aggarwal.

**Figure 1.** SITA training loop: A flow model θ trained on high-temperature samples is used to generate proposals for training an energy-based model ϕ. Importance-weighted resampling with the learned surrogate likelihoods produces samples at lower temperatures, which seeds the next annealing step without expensive Jacobian computations. • Surrogate-driven annealed importance sampling. We integrate a BoltzNCE-style surr…

**Figure 2.** Alanine dipeptide comparison on 30,000 samples from both SITA and MD simulation.

**Figure 3.** TICA projection density scatter plots comparing MD-generated and SITA flow…

**Figure 4.** TICA downsampling comparison at different lag times. All plots represent…

Original abstract

A long-standing challenge in computational chemistry and biophysics is efficiently sampling the Boltzmann distribution of molecules. Advances in generative modeling have been proposed to address the limitations of conventional sampling techniques by eliminating the computational cost of simulation. A promising direction is iteratively finetuning diffusion models along a temperature ladder whereby training data is generated via importance sampling during inference-time annealing. Unfortunately, these methods require computing a divergence over the score field to estimate importance weights, rendering them intractable for larger systems. Here we present scalable inference-time annealing (SITA), which retrains flow-based models to generate samples at progressively lower temperatures using an energy-based model to facilitate fast surrogate likelihoods. We demonstrate state-of-the-art performance on both Alanine Dipeptide and Alanine Tripeptide while avoiding costly divergence terms. Our code is available at https://github.com/countrsignal/sita.git

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

SITA swaps divergence-based importance weights for energy-based surrogate likelihoods to scale inference-time annealing on molecular systems, with results on alanine peptides but open questions on bias accumulation.

The core contribution is replacing the intractable divergence-over-score computation with surrogate likelihoods from a retrained energy-based model. This lets the annealing schedule run without that cost, and the authors report state-of-the-art results on alanine dipeptide and tripeptide while releasing code.

That substitution is the actual novelty, and the empirical demonstration on standard small-peptide benchmarks is concrete. Releasing the implementation is also helpful for anyone who wants to test the approach on similar systems.

The soft spot is the load-bearing premise flagged above. The surrogates need to stay accurate enough across the temperature ladder that the generated samples still match the target Boltzmann distribution. Without reported diagnostics that directly compare surrogate weights to exact likelihoods at each step, it is hard to rule out accumulated approximation error. The abstract does not mention such checks, so the full paper needs to show them.

This work is for groups already working on generative models for molecular sampling in biophysics or drug design. A reader who needs a practical way to avoid the divergence term will find the method and the released code useful.

It deserves peer review because it targets a known scaling barrier with a clear technical change and reproducible results on accepted benchmarks, even if the bias analysis requires more detail.

Referee Report

1 major / 0 minor

Summary. The paper introduces Scalable Inference-Time Annealing (SITA), a method that retrains flow-based models along a temperature ladder for sampling Boltzmann distributions of molecules. It replaces divergence-based importance weights with fast surrogate likelihoods obtained from an auxiliary energy-based model, claiming this enables scalable inference-time annealing and yields state-of-the-art performance on Alanine Dipeptide and Alanine Tripeptide while avoiding costly divergence computations. Code is released at https://github.com/countrsignal/sita.git.

Significance. If the surrogate estimates remain sufficiently accurate and unbiased across the annealing schedule, the approach could meaningfully extend generative modeling techniques to larger biomolecular systems by eliminating a key computational bottleneck. The public release of code is a positive step toward reproducibility.

major comments (1)

[Abstract / Method description] The central performance claim depends on the surrogate likelihoods from the retrained energy-based model supplying sufficiently accurate and unbiased estimates to replace divergence-based importance weights throughout the temperature ladder. However, the manuscript provides no direct diagnostic (e.g., KL divergence, log-weight error, or effective sample size comparison between surrogate and exact likelihoods on held-out configurations) at each temperature step, leaving open the possibility that accumulated approximation error degrades sample quality even when final metrics appear competitive.
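Two of the diagnostics named in this comment can be sketched as follows. This is an illustrative sketch only: the function names and the synthetic log-weights are assumptions, and it presumes that exact log-weights are available for a held-out set, which is practical only on systems small enough for the divergence to be computed.

```python
import numpy as np

def _normalize(log_w):
    """Turn unnormalized log-weights into log-probabilities (logsumexp shift)."""
    log_w = np.asarray(log_w, dtype=float)
    m = log_w.max()
    return log_w - (m + np.log(np.exp(log_w - m).sum()))

def log_weight_diagnostics(log_w_exact, log_w_surrogate):
    """Compare surrogate log-weights with exact ones on the same configurations."""
    la = _normalize(log_w_exact)
    lb = _normalize(log_w_surrogate)
    diff = lb - la
    # Mean absolute log-weight error after removing any constant offset.
    centered_mae = float(np.mean(np.abs(diff - diff.mean())))
    # KL divergence from the exact to the surrogate normalized weights.
    kl = float(np.sum(np.exp(la) * (la - lb)))
    return {"centered_mae": centered_mae, "kl": kl}

rng = np.random.default_rng(0)
exact = rng.normal(size=2000)                       # synthetic exact log-weights
shifted = log_weight_diagnostics(exact, exact + 3.7)
noisy = log_weight_diagnostics(exact, exact + 0.5 * rng.normal(size=2000))
```

A constant offset between the two sets of log-weights leaves both diagnostics at zero, reflecting that self-normalized weights ignore additive constants, while independent noise in the surrogate raises both.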

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their constructive feedback on our manuscript. We address the major comment below and will revise the manuscript accordingly to strengthen the validation of the surrogate likelihoods.

Referee: [Abstract / Method description] The central performance claim depends on the surrogate likelihoods from the retrained energy-based model supplying sufficiently accurate and unbiased estimates to replace divergence-based importance weights throughout the temperature ladder. However, the manuscript provides no direct diagnostic (e.g., KL divergence, log-weight error, or effective sample size comparison between surrogate and exact likelihoods on held-out configurations) at each temperature step, leaving open the possibility that accumulated approximation error degrades sample quality even when final metrics appear competitive.

Authors: We agree that direct diagnostics comparing the surrogate likelihoods to exact divergence-based weights would provide stronger support for the central claim. In the revised manuscript we will add evaluations of KL divergence between surrogate and exact log-weights, log-weight error statistics, and effective sample size ratios on held-out configurations at each temperature step along the annealing ladder. These results will be reported both in the main text and in an expanded supplementary section.

revision: yes

Circularity Check

0 steps flagged

No circularity: empirical claims rest on external benchmarks, not self-defined fits

The paper introduces SITA by retraining flow models with an auxiliary energy-based surrogate for likelihoods during temperature annealing, avoiding explicit divergence terms. The central result is an empirical demonstration of state-of-the-art sampling performance on Alanine Dipeptide and Tripeptide. No derivation step reduces a claimed prediction to a quantity defined by the method itself, no fitted parameter is relabeled as a prediction, and no load-bearing premise depends on a self-citation chain or imported uniqueness theorem. The surrogate is presented as an independent modeling choice whose accuracy is assessed via downstream sampling quality on held-out molecular systems, keeping the argument self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review supplies no explicit free parameters, axioms, or invented entities beyond the high-level modeling choice of an energy-based surrogate; ledger therefore remains empty.
