arxiv: 2512.16812 · v4 · submitted 2025-12-18 · ❄️ cond-mat.stat-mech · physics.comp-ph

Efficient Monte Carlo sampling of metastable systems using non-local collective variable updates

Christoph Schönle, Davide Carbone, Marylou Gabrié, Tony Lelièvre, Gabriel Stoltz

Pith reviewed 2026-05-16 21:04 UTC · model grok-4.3

classification ❄️ cond-mat.stat-mech physics.comp-ph

keywords Monte Carlo sampling · collective variables · metastable systems · Langevin dynamics · non-local updates · reversibility · molecular simulation · machine learning proposals

The pith

Non-local collective variable updates extend to non-linear CVs and underdamped Langevin dynamics while preserving reversibility and improving sampling efficiency for metastable molecular systems.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper generalizes non-local proposal updates in collective variable space to create a Monte Carlo scheme that handles non-linear CVs and underdamped Langevin dynamics. It provides an explicit algorithm, proves reversibility so the method samples the correct equilibrium distribution, and shows through numerical tests a clear performance gain over earlier overdamped versions. This matters for simulating complex molecular systems that get stuck in metastable states, because machine-learning proposal generators now make non-local moves practical in CV spaces of tens to hundreds of dimensions. Readers would care because the approach keeps the benefits of non-local jumps without sacrificing correctness or requiring changes to the target distribution.

Core claim

We generalize these approaches and explicitly spell out an algorithm for non-linear CVs and underdamped Langevin dynamics. We prove reversibility of the resulting scheme and demonstrate its performance on several numerical examples, observing a substantial performance increase compared to methods based on overdamped Langevin dynamics as considered previously. Advances in generative machine-learning-based proposal samplers now enable efficient sampling in CV spaces of intermediate dimensionality (tens to hundreds of variables), and our results extend their applicability toward more realistic molecular systems.

What carries the argument

Generalized non-local Metropolis-Hastings proposal scheme in collective variable space, adapted for non-linear mappings and underdamped Langevin dynamics while enforcing reversibility.

If this is right

The scheme applies to non-linear collective variables without loss of reversibility.
Detailed balance is satisfied, so the sampled distribution matches the target Boltzmann measure.
Numerical examples exhibit substantially faster exploration of metastable states than overdamped counterparts.
The method directly leverages efficient high-dimensional CV proposals from current generative models.
It remains valid for underdamped dynamics, allowing velocity information to be incorporated in the updates.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same non-local CV framework could be paired with parallel tempering or other global moves to tackle even higher barriers.
Performance gains may scale to larger biomolecular systems where CV spaces reach hundreds of dimensions.
The reversibility proof technique might adapt to other stochastic integrators such as those with position-dependent friction.
Direct comparison on a protein-folding benchmark would test whether the speedup persists when the CV space is learned rather than hand-crafted.

Load-bearing premise

Generative machine-learning-based samplers can now efficiently generate proposals in collective variable spaces of intermediate dimensionality for realistic molecular systems.

What would settle it

Apply the algorithm to a metastable test system with known equilibrium distribution and measure either a deviation from that distribution or no reduction in mixing time relative to standard overdamped Langevin updates.

Figures

Figures reproduced from arXiv: 2512.16812 by Christoph Schönle, Davide Carbone, Gabriel Stoltz, Marylou Gabrié, Tony Lelièvre.

**Figure 2.** Typical driven transition path from one mode center …

**Figure 3.** Gaussian Tunnel marginals for the optimum perfor…

**Figure 6.** Two possible transition paths in ϕ⁴ model from positive to negative values of ϕ with the same parameters as in …

**Figure 7.** Typical configurations for the dimer in the compact …

**Figure 8.** Performance comparison for the dimer model in …

**Figure 9.** Free energy profile for the dimer model obtained …

**Figure 10.** Open and closed state of the polymer (red) and …

**Figure 11.** Simulation of the polymer with a one-dimensional …

**Figure 13.** The optimal value of α2 is roughly the same for all choices of α1, whereas the optimal number of intermediate steps KT increases with α1. Clearly, the optimal performance is observed for the deterministic dynamics, namely α1 = 0. [Plot panels: Deterministic (α1 = 0.00), α1 = 0.25, α1 = 0.50, α1 = 0.75, Overdamped (α1 = 1.00); x-axis: KT(b); y-axis: α2.] …

**Figure 14.** Acceptance rate for a fixed jump from z = 0 to z̃ = b for the Gaussian tunnel with the deterministic dynamics α1 = 0. The results shown here are the same as in the top left panel of …

**Figure 15.** Simulation of the Gaussian tunnel example introduced in Section III A (with smaller dimension …

**Figure 16.** Mode jump cost for the ϕ⁴ model from Section III B for different parameters, estimated from running 20 MCMC chains over 10,000 iterations. The x-axis shows the number of steps for a steered schedule over the distance 2ϕ* of the two modes.

**Figure 17.** Mode jump cost for the dimer model from Section III C for different parameters, estimated from running 20 MCMC …

Original abstract

Monte Carlo simulations are widely used to simulate complex molecular systems, but standard approaches suffer from metastability. Lately, the use of non-local proposal updates in a collective-variable (CV) space has been proposed in several works. Here, we generalize these approaches and explicitly spell out an algorithm for non-linear CVs and underdamped Langevin dynamics. We prove reversibility of the resulting scheme and demonstrate its performance on several numerical examples, observing a substantial performance increase compared to methods based on overdamped Langevin dynamics as considered previously. Advances in generative machine-learning-based proposal samplers now enable efficient sampling in CV spaces of intermediate dimensionality (tens to hundreds of variables), and our results extend their applicability toward more realistic molecular systems.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

The paper gives a reversible algorithm for non-local nonlinear CV proposals in underdamped Langevin dynamics and shows clear sampling gains over overdamped baselines.

The main advance is an explicit algorithm that extends non-local collective-variable proposals to nonlinear maps and underdamped Langevin dynamics, together with a reversibility proof. Earlier work stayed in the overdamped setting; this version handles the full phase-space state and ties the proposals to current generative models that can work in CV spaces of tens to hundreds of dimensions. The numerical tests on several examples report a substantial efficiency increase compared with the overdamped versions, and those tests look independent of the theory.

Referee Report

1 major / 2 minor

Summary. The manuscript generalizes non-local collective-variable (CV) proposal updates from prior overdamped Langevin work to non-linear CV maps s(q) and underdamped Langevin dynamics. It spells out an explicit algorithm, proves reversibility of the resulting scheme, and reports substantial performance gains on several numerical examples relative to overdamped baselines. The work is motivated by recent generative ML samplers that can handle intermediate-dimensional CV spaces.

Significance. If the reversibility proof is complete and the reported speed-ups are reproducible, the method would meaningfully extend efficient CV-based Monte Carlo sampling to more realistic molecular dynamics, allowing use of underdamped integrators while retaining non-local proposals.

major comments (1)

[Reversibility proof] Reversibility proof (section detailing the underdamped generalization): the argument must explicitly construct the Metropolis-Hastings acceptance probability that incorporates the full phase-space measure, including the Jacobian of the non-linear CV map s(q) and the kinetic-energy term ½pᵀM⁻¹p after momentum resampling. The current sketch appears to address only position-space reversibility; without the combined ratio the stationary distribution is not guaranteed to be the correct canonical measure.

minor comments (2)

[Algorithm description] Algorithm 1 (or equivalent pseudocode): add explicit steps for momentum resampling and the full acceptance ratio computation, including how the kinetic factor is evaluated.
[Numerical results] Numerical examples: report effective sample sizes or integrated autocorrelation times rather than raw wall-clock speed-ups to allow direct comparison with overdamped baselines.
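
The quantity requested in the last minor comment can be estimated from any scalar observable recorded along the chain. The following is a minimal sketch, not the paper's analysis code: the function names, the self-consistent windowing rule, and the window constant `c = 5` are assumptions made here for illustration, and the AR(1) chain is only a toy input with a known answer.

```python
import random


def integrated_autocorr_time(x, c=5.0):
    """Estimate the integrated autocorrelation time (IAT) of a scalar chain.

    The sum over lags is truncated at the first lag M with M >= c * tau
    (a self-consistent window; the constant c is an assumption of this sketch).
    """
    n = len(x)
    mean = sum(x) / n
    centered = [xi - mean for xi in x]
    var = sum(v * v for v in centered) / n
    tau = 1.0
    for lag in range(1, n // 2):
        # Normalized autocorrelation at this lag.
        rho = sum(centered[i] * centered[i + lag] for i in range(n - lag)) / (n * var)
        tau += 2.0 * rho
        if lag >= c * tau:
            break
    return tau


def effective_sample_size(x):
    """Number of effectively independent samples, n / IAT."""
    return len(x) / integrated_autocorr_time(x)


def ar1_chain(n, phi, seed=0):
    """Toy AR(1) chain whose exact IAT is (1 + phi) / (1 - phi)."""
    rng = random.Random(seed)
    x, out = 0.0, []
    for _ in range(n):
        x = phi * x + rng.gauss(0.0, 1.0)
        out.append(x)
    return out
```

Applied to the collective-variable trace of an underdamped run and of an overdamped run at equal cost, the two IAT values would give the kind of cost-normalized comparison the referee asks for.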

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their careful reading and constructive feedback on our manuscript. We address the single major comment below and have revised the manuscript to strengthen the presentation of the reversibility argument.

Referee: [Reversibility proof] Reversibility proof (section detailing the underdamped generalization): the argument must explicitly construct the Metropolis-Hastings acceptance probability that incorporates the full phase-space measure, including the Jacobian of the non-linear CV map s(q) and the kinetic-energy term ½pᵀM⁻¹p after momentum resampling. The current sketch appears to address only position-space reversibility; without the combined ratio the stationary distribution is not guaranteed to be the correct canonical measure.

Authors: We agree that the original presentation of the reversibility argument was too concise and did not make the full phase-space ratio explicit. In the revised manuscript we have expanded the derivation in the underdamped section to construct the Metropolis-Hastings acceptance probability step by step. The new text explicitly includes (i) the Jacobian determinant of the non-linear map s(q) arising from the change of variables in the position update and (ii) the ratio of the kinetic-energy factors exp(−β·½pᵀM⁻¹p) evaluated before and after independent momentum resampling from the Maxwell-Boltzmann distribution. The resulting acceptance probability is shown to satisfy detailed balance with respect to the full canonical measure proportional to exp(−βH(q,p)) on phase space. We believe this revision removes any ambiguity and confirms that the correct stationary distribution is preserved.

Revision: yes

Circularity Check

0 steps flagged

No significant circularity detected

The paper generalizes prior CV-based Monte Carlo proposals to non-linear collective variables and underdamped Langevin dynamics, then supplies an explicit algorithm whose reversibility is established by a direct proof and whose efficiency gain is shown via independent numerical tests on concrete systems. No equation or claim reduces by construction to a fitted parameter renamed as a prediction, nor does any load-bearing step rest on a self-citation whose content is itself unverified within the paper. The derivation chain is therefore self-contained against external benchmarks (mathematical proof plus separate simulation results) and receives the default non-circularity finding.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on standard properties of Markov chains and Langevin dynamics without introducing new fitted parameters or postulated entities in the abstract description.

axioms (1)

standard math: The non-local proposal updates preserve detailed balance when combined with the acceptance step for the generalized CV and dynamics.
Invoked to establish reversibility of the resulting Markov chain.

pith-pipeline@v0.9.0 · 5435 in / 1086 out tokens · 38037 ms · 2026-05-16T21:04:21.223945+00:00 · methodology

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

The critical slowing down in diffusion models
cond-mat.dis-nn 2026-05 conditional novelty 8.0

Diffusion models on the Gaussian O(n) model exhibit critical slowing down with shallow networks that deeper local score approximations can reduce to logarithmic training-time scaling.

Reference graph

Works this paper leans on

12 extracted references · 12 canonical work pages · cited by 1 Pith paper

[1]

Introducing notation. This subsection largely uses tools and results from Ref. 39. Let us introduce a number of quantities which we will need later on. A central quantity of interest is the free energy $F : \mathbb{R}^\ell \to \mathbb{R}$ associated with the target measure $\nu$ and the collective variable $\xi$, defined as (see (II.2) for the definition of $\Sigma(z)$) $F(z) = -\frac{1}{\beta} \log \int_{\Sigma(z)} Z^{-1}$ ...
[2]

Properties of the steered dynamics. In this section, we state a number of properties of the RATTLE scheme introduced in (II.17)-(II.19). Since we follow dynamics for the energy modified by the Fixman term, in principle we would need to put a tilde on all associated measures and quantities, but we omit it here for the sake of readability and just consider t...
[3]

... $1 \wedge \exp(-\beta W_{n+1})\,\dfrac{\rho(\tilde Z_{n+1}, Z_n)}{\rho(Z_n, \tilde Z_{n+1})}$ ... $+\ \mathbb{E}\big[\varphi(Q_n, Q_n)$ ...

Proof of reversibility. We are now in a position to prove Theorem II.4, namely that the algorithm presented in Section II B is reversible with respect to the target probability measure $\nu$. Assume that $Q_n$ is distributed according to $\nu$. The aim is to prove that $(Q_n, Q_{n+1})$ has the same law as $(Q_{n+1}, Q_n)$. In order to study the law of $(Q_n, Q_{n+1})$, let us consider a bou...
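
The acceptance factor visible in the proof fragment, 1 ∧ exp(−βW) ρ(Z̃, Z)/ρ(Z, Z̃), can be evaluated as in the sketch below. This is an illustration only: the function names are hypothetical, W is taken to be a work-like quantity accumulated along the steered path, and ρ(a, b) is read as the density of proposing b from a. Both readings are assumptions, since the excerpt is truncated before these objects are defined.

```python
import math
import random


def acceptance_probability(beta, work, rho_forward, rho_backward):
    """Return min(1, exp(-beta * work) * rho_backward / rho_forward).

    rho_forward  : density of proposing the new CV value from the old one
    rho_backward : density of the reverse proposal (new to old)
    The ratio is formed in log space to avoid overflow for large |work|.
    """
    if rho_forward <= 0.0:
        raise ValueError("forward proposal density must be positive")
    if rho_backward <= 0.0:
        return 0.0  # the reverse move is impossible, so the move is rejected
    log_ratio = -beta * work + math.log(rho_backward) - math.log(rho_forward)
    return 1.0 if log_ratio >= 0.0 else math.exp(log_ratio)


def accept(beta, work, rho_forward, rho_backward, rng=random):
    """Draw the accept/reject decision for one proposed non-local move."""
    return rng.random() < acceptance_probability(beta, work, rho_forward, rho_backward)
```

With a symmetric proposal the density ratio drops out and only exp(−βW) remains.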
[4]

Overdamped limit. For the choice $\gamma = 2$ and $M = \Delta t/2$ (and therefore $\sigma = 2/\sqrt{\beta}$), the forward dynamics (D.1)-(D.3) simplify to overdamped Langevin dynamics

$$\begin{cases} q^{k+1}_{\mathrm{CV}} = z(t_{k+1}), \\ q^{k+1}_{\perp} = q^{k}_{\perp} + \sqrt{\dfrac{2\Delta t}{\beta}}\, G^{k}_{\perp} - \Delta t\, \nabla_{\perp} V(q^{k}). \end{cases} \tag{D.4}$$

In terms of the parameters of the normalized algorithm introduced in Section II D, this parameter choice corresponds to setting $\alpha_1 = 1$. ...
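
The orthogonal-coordinate update in (D.4) is an Euler-Maruyama step, which the sketch below spells out. It is a simplified illustration rather than the paper's implementation: `overdamped_step` and `grad_perp` are hypothetical names, and the gradient is treated as a function of the orthogonal coordinates alone, with the steered CV coordinate held fixed.

```python
import math
import random


def overdamped_step(q_perp, grad_perp, dt, beta, rng):
    """One step of the orthogonal part of (D.4).

    q_perp    : list of floats, current orthogonal coordinates
    grad_perp : callable returning the orthogonal gradient of V as a list
                (assumed here to depend on q_perp only)
    """
    grad = grad_perp(q_perp)
    noise_scale = math.sqrt(2.0 * dt / beta)
    return [
        q + noise_scale * rng.gauss(0.0, 1.0) - dt * g
        for q, g in zip(q_perp, grad)
    ]
```

For a harmonic potential V = q²/2 at β = 1, iterating this step gives samples with variance close to 1, up to a time-step bias of order Δt.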
[5]

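Deterministic limit. In the deterministic limit, we have $\gamma = \sigma = 0$. The forward dynamics then simplify to

$$\begin{cases} q^{k+1}_{\mathrm{CV}} = z(t_{k+1}), \\ q^{k+1}_{\perp} = q^{k}_{\perp} + \dfrac{\Delta t}{M}\, p^{k}_{\perp} - \dfrac{\Delta t^{2}}{2M}\, \nabla_{\perp} V(q^{k}), \\ p^{k+1}_{\mathrm{CV}} = M\, v_z(t_{k+1}), \\ p^{k+1}_{\perp} = p^{k}_{\perp} - \dfrac{\Delta t}{2} \left( \nabla_{\perp} V(q^{k}) + \nabla_{\perp} V(q^{k+1}) \right). \end{cases} \tag{D.5}$$

This corresponds to the deterministic dynamics for the CV coordinate and Verlet integration for t...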
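
The four lines of (D.5) translate almost directly into code. The sketch below is illustrative only: the function names are hypothetical, the mass is assumed scalar, and the orthogonal gradient is assumed to depend on the orthogonal coordinates alone, which drops the coupling to the steered CV coordinate that the full scheme would carry.

```python
def verlet_step(q_perp, p_perp, grad_perp, dt, mass):
    """Velocity-Verlet update of the orthogonal block of (D.5)."""
    g_old = grad_perp(q_perp)
    q_new = [
        q + dt / mass * p - dt * dt / (2.0 * mass) * g
        for q, p, g in zip(q_perp, p_perp, g_old)
    ]
    g_new = grad_perp(q_new)
    p_new = [
        p - 0.5 * dt * (go + gn)
        for p, go, gn in zip(p_perp, g_old, g_new)
    ]
    return q_new, p_new


def steered_cv(z, v_z, t_next, mass):
    """CV block of (D.5): position pinned to the schedule z, momentum M * v_z."""
    return z(t_next), mass * v_z(t_next)
```

A useful property to check on such a step is time reversibility: stepping forward, flipping the momentum, and stepping again returns the initial state with flipped momentum, up to rounding.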
[6]

Computation of the Fixman term. Recall that we use the reaction coordinate defined in (III.6): $\xi(q) = \dfrac{|q_1 - q_2| - r_0}{2w}$. We write the gradient of the reaction coordinate $\nabla\xi(q) \in \mathbb{R}^{d \times 1}$ in the form

$$\nabla\xi(q) = \frac{1}{2w} \begin{pmatrix} \dfrac{q_1 - q_2}{|q_1 - q_2|} \\ -\dfrac{q_1 - q_2}{|q_1 - q_2|} \\ 0 \\ \vdots \\ 0 \end{pmatrix} = \frac{1}{2w} \begin{pmatrix} e_{12} \\ -e_{12} \\ 0 \\ \vdots \\ 0 \end{pmatrix}$$

with the normalized vector $e_{12} = q_1 - q_2$ ...
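
The reaction coordinate and its gradient are simple enough to check numerically. The sketch below uses hypothetical function names and returns only the two non-zero blocks of the gradient (those of the dimer particles); the solvent blocks are zero, as in the displayed column vector.

```python
import math


def dimer_cv(q1, q2, r0, w):
    """xi(q) = (|q1 - q2| - r0) / (2 w)."""
    return (math.dist(q1, q2) - r0) / (2.0 * w)


def dimer_cv_gradient(q1, q2, w):
    """Non-zero blocks of grad xi: (e12, -e12) / (2 w)."""
    dist = math.dist(q1, q2)
    e12 = [(a - b) / dist for a, b in zip(q1, q2)]  # unit vector from q2 to q1
    g1 = [e / (2.0 * w) for e in e12]
    g2 = [-e / (2.0 * w) for e in e12]
    return g1, g2
```

Because e12 has unit length, the squared Euclidean norm of this gradient is 2 · (1/(2w))² = 1/(2w²), independent of the configuration.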
[7]

Lagrange multiplier for position constraint. From (II.18), the enforcement of the position constraint can be rewritten as

$$\begin{cases} q^{k+1} = \tilde q^{k+1} + \Delta t\, M^{-1} \nabla\xi(q^{k})\, \lambda^{k+1/2}, \\ \xi(q^{k+1}) = z(t_{k+1}), \end{cases} \tag{$C_q$}$$

with $\tilde q^{k+1} = q^{k} + \Delta t\, M^{-1} p^{k+1/4} - M^{-1} \dfrac{\Delta t^{2}}{2} \nabla \tilde V(q^{k})$. Inserting the first equation into the second and using the definition (III.6) of $\xi$ leads to a quadratic equation for $\lambda$...
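
For the dimer coordinate ξ(q) = (|q1 − q2| − r0)/(2w), the quadratic mentioned above can be written out explicitly. The sketch below is one way to do so under assumptions that the truncated excerpt does not confirm: a scalar mass shared by both dimer particles, and selection of the root of smallest magnitude. The function name and argument layout are hypothetical.

```python
import math


def position_multiplier(q1_old, q2_old, q1_tilde, q2_tilde, z_target, r0, w, dt, mass):
    """Solve xi(q_new) = z_target for the multiplier lambda.

    q_new = q_tilde + dt / mass * grad_xi(q_old) * lambda, with
    grad_xi(q_old) = (e12, -e12) / (2 w) evaluated at the OLD positions.
    """
    dist_old = math.dist(q1_old, q2_old)
    e12 = [(a - b) / dist_old for a, b in zip(q1_old, q2_old)]
    d = [a - b for a, b in zip(q1_tilde, q2_tilde)]  # unconstrained bond vector
    a = dt / (2.0 * w * mass)                        # shift per unit lambda
    r = 2.0 * w * z_target + r0                      # target bond length
    d_dot_e = sum(x * y for x, y in zip(d, e12))
    # |d + 2 a lambda e12|^2 = r^2 is a quadratic in lambda.
    disc = d_dot_e ** 2 - sum(x * x for x in d) + r * r
    if disc < 0.0:
        raise ValueError("no real multiplier satisfies the constraint")
    roots = [(-d_dot_e + s * math.sqrt(disc)) / (2.0 * a) for s in (1.0, -1.0)]
    lam = min(roots, key=abs)  # assumption: take the smallest correction
    q1_new = [x + a * lam * e for x, e in zip(q1_tilde, e12)]
    q2_new = [x - a * lam * e for x, e in zip(q2_tilde, e12)]
    return lam, q1_new, q2_new
```

A negative discriminant means the target bond length cannot be reached by moving along the old gradient direction; how the paper's algorithm treats that case is not visible in this excerpt.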
[8]

Free dimer not interacting with solvent particles. Considering the dimer model and the collective variable introduced in Section III C, the free energy can be computed analytically for the special case when the solvent particles do not interact with the dimer. We have

$$F(z) = -\frac{1}{\beta} \log \int e^{-\beta V(q)}\, \delta_{\xi(q) - z}(dq) = -\frac{1}{\beta} \log \int e^{-\beta V(q)}\, |\nabla\xi(q)|^{-1}\, \sigma^{M}_{\Sigma(z)}(dq) = -\frac{1}{\beta} \log \left[ e^{-\beta V_D(2wz + r_0)}\, (2wz + r_0) \right] + \mathrm{const}$$

...
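
Taking the extracted expression at face value, the logarithm of the product splits into an energetic and an entropic term. The power of one on the factor (2wz + r0) is what appears in the extracted text; whether an exponent was lost in extraction cannot be checked from this excerpt.

```latex
F(z) = -\frac{1}{\beta}\log\!\left[e^{-\beta V_D(2wz+r_0)}\,(2wz+r_0)\right] + \mathrm{const}
     = V_D(2wz+r_0) - \frac{1}{\beta}\log(2wz+r_0) + \mathrm{const}
```

Here 2wz + r0 is the bond length corresponding to the CV value z, so the first term is the dimer potential at that bond length and the second is a geometric correction.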
[9]

Gaussian tunnel. To illustrate the optimal choice of parameters when running the algorithm on the example of the Gaussian tunnel from Section III A, we show the inverse mode jump cost (introduced in the main text) for different parameter values in Fig. 13. The optimal value of α2 is roughly the same for all choices of α1, whereas the optimal number of interme...
[10]

Gaussian tunnel with a non-linear collective variable. We consider here a version of the Gaussian tunnel introduced in Section III A in dimension $d = 10$ with a non-linear collective variable. With the same underlying probability distribution, we consider the CV $\xi(z) = \tanh\!\left(\dfrac{z}{b}\right) \cdot \dfrac{b}{\tanh(1)}$. Using the push-forward of the probability density $\nu_{\mathrm{CV}}(z)$ under this tra...
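
The non-linear CV above, read as ξ(z) = tanh(z/b) · b/tanh(1), and the push-forward of a one-dimensional density under it can be sketched as follows. The function names are hypothetical, and `density_z` stands for any normalized density in the original coordinate; the actual ν_CV of the Gaussian tunnel is not given in this excerpt.

```python
import math


def xi(z, b):
    """Non-linear CV xi(z) = tanh(z / b) * b / tanh(1); note xi(b) = b."""
    return math.tanh(z / b) * b / math.tanh(1.0)


def xi_inverse(y, b):
    """Inverse map, defined for |y| < b / tanh(1)."""
    return b * math.atanh(y * math.tanh(1.0) / b)


def xi_derivative(z, b):
    """d xi / d z = sech^2(z / b) / tanh(1)."""
    return 1.0 / (math.cosh(z / b) ** 2 * math.tanh(1.0))


def pushforward_density(y, b, density_z):
    """Density of xi(Z) at y when Z has density density_z (change of variables)."""
    z = xi_inverse(y, b)
    return density_z(z) / abs(xi_derivative(z, b))
```

The normalization by tanh(1) keeps the point z = b fixed, so a jump target at b in the original coordinate remains at b in the transformed one.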
[11]

ϕ⁴ model. To complement the results shown in Fig. 4, we ran the algorithm on the ϕ⁴ model from Section III B for additional values of α1 and show the associated mode-jump cost in Fig. 16. [Plot panels of Fig. 16: Deterministic (α1 = 0.00), α1 = 0.05, α1 = 0.25, α1 = 0.50, ...; x-axis: KT(2ϕ*); y-axis: α2.]
[12]

Dimer in a solvent. In addition to Fig. 8, we ran the algorithm on the dimer model for additional values of α1. The associated inverse mode-jump cost is shown in Fig. 17. Apart from the deterministic algorithm (α1 = 0), almost no mode switches were observed within the computational budget. [Plot panels of Fig. 17: Deterministic (α1 = 0.00), α1 = 0.25, ...; x-axis: 1/v; y-axis: α2.]