Theoretical guarantees for lifted samplers

Florian Maire; Philippe Gagnon

arxiv: 2405.15952 · v3 · submitted 2024-05-24 · stat.CO · math.ST · stat.TH

Pith reviewed 2026-05-24 01:17 UTC · model grok-4.3

keywords lifted samplers · asymptotic variance · MCMC · Metropolis-Hastings · theoretical guarantees · Markov chains

The pith

Lifted samplers keep asymptotic variance within a factor of two of the base sampler in general MCMC settings.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that for a broad class of lifted Markov chain Monte Carlo samplers the asymptotic variance is at most twice that of the original algorithm. This holds regardless of the target distribution or the method used to induce directions, and it covers derivations from Metropolis-Hastings, reversible jump, and similar base algorithms. The analysis applies Tierney's asymptotic variance expression to the augmented chain once the marginal on the original space is shown to match the base chain exactly. A sympathetic reader would care because the result quantifies the worst-case performance cost of lifting, indicating that substantial gains are possible while the downside remains limited.

Core claim

In a general framework for lifted samplers derived from various base algorithms like Metropolis-Hastings or reversible jump, the asymptotic variances cannot increase by a factor of more than 2, regardless of the target distribution and direction induction method. This follows from the marginal chain coinciding with the base and direct application of Tierney's asymptotic variance expression. The result indicates that while there is potentially a lot to gain from lifting a sampler, there is not much to lose.
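For reference, the quantity being bounded can be written out as below. The notation ($\pi$ for the target, $f$ for the test function, $(X_k)$ for a stationary chain with kernel $P$) is assumed here for illustration. The series form holds when the sum converges, and the inequality is the bound as this summary states it. The abstract qualifies the result with "essentially", so the paper's exact theorem may carry additional conditions or terms.

```latex
\operatorname{var}(f, P)
  = \lim_{n \to \infty} n \operatorname{Var}\!\Big( \frac{1}{n} \sum_{k=1}^{n} f(X_k) \Big)
  = \operatorname{Var}_{\pi}(f) + 2 \sum_{k \ge 1} \operatorname{Cov}_{\pi}\big( f(X_0), f(X_k) \big),
\qquad
\operatorname{var}(f, P_{\mathrm{Lifted}}) \le 2 \operatorname{var}(f, P_{\mathrm{MH}}).
```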

What carries the argument

The general framework analysis that invokes Tierney's (1998) asymptotic variance expression on the augmented chain when the marginal chain on the original state space coincides exactly with the base algorithm.

If this is right

The asymptotic variance of any such lifted sampler is bounded above by twice the variance of the base sampler.
This bound is independent of the specific target distribution.
The guarantee extends to lifted versions of reversible jump algorithms and other Metropolis-Hastings variants.
Lifting can be applied with the assurance that variance will not more than double.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The bound might be checked for tightness through simulation on simple targets such as uniform distributions on integers.
Similar variance controls could be sought for other persistent or direction-augmented MCMC schemes.
The result encourages routine use of lifted versions when the base algorithm already performs adequately.

Load-bearing premise

The lifted sampler is constructed so that its marginal chain on the original state space coincides with the base algorithm and that Tierney's asymptotic variance expression applies directly to the augmented chain.

What would settle it

A concrete lifted sampler and target distribution where the asymptotic variance ratio exceeds two would falsify the claimed bound.
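One way to look for such a counterexample on a small finite state space is to compute both asymptotic variances exactly rather than by simulation. The sketch below is not the paper's method. It uses the standard fundamental-matrix identity for ergodic finite chains, var(f, P) = 2⟨f̄, Z f̄⟩_π − ⟨f̄, f̄⟩_π with Z = (I − P + 1πᵀ)⁻¹ and f̄ = f − π(f), which does not require reversibility. The target weights `w`, the test function `f`, and the ±1 nearest-neighbour proposal are hypothetical choices made for illustration.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def asymptotic_variance(P, pi, f):
    """Exact asymptotic variance of the ergodic average of f under kernel P."""
    n = len(pi)
    mean = sum(p * v for p, v in zip(pi, f))
    fbar = [v - mean for v in f]
    # A = I - P + 1 pi^T, so that g = A^{-1} fbar = Z fbar.
    A = [[(1.0 if i == j else 0.0) - P[i][j] + pi[j] for j in range(n)]
         for i in range(n)]
    g = solve(A, fbar)
    inner = sum(p * fb * gv for p, fb, gv in zip(pi, fbar, g))
    var_pi = sum(p * fb * fb for p, fb in zip(pi, fbar))
    return 2.0 * inner - var_pi


def base_kernel(w):
    """Metropolis kernel on {0, ..., K-1}: propose x - 1 or x + 1 w.p. 1/2 each."""
    K = len(w)
    P = [[0.0] * K for _ in range(K)]
    for x in range(K):
        for d in (-1, 1):
            y = x + d
            if 0 <= y < K:
                P[x][y] = 0.5 * min(1.0, w[y] / w[x])
        P[x][x] = 1.0 - sum(P[x])  # rejected proposals stay at x
    return P


def lifted_kernel(w):
    """Lifted kernel on pairs (x, nu); state (x, nu) is stored at index 2*x + s,
    with s = 0 for nu = +1 and s = 1 for nu = -1."""
    K = len(w)
    P = [[0.0] * (2 * K) for _ in range(2 * K)]
    for x in range(K):
        for s, nu in enumerate((1, -1)):
            y = x + nu
            a = min(1.0, w[y] / w[x]) if 0 <= y < K else 0.0
            if a > 0.0:
                P[2 * x + s][2 * y + s] = a          # accept: move, keep direction
            P[2 * x + s][2 * x + (1 - s)] = 1.0 - a  # reject: stay, flip direction
    return P


# Hypothetical unnormalised target and test function f(x) = x.
w = [1.0, 2.0, 4.0, 2.0, 1.0, 3.0]
pi = [v / sum(w) for v in w]
f = [float(x) for x in range(len(w))]
pi_lift = [p / 2.0 for p in pi for _ in range(2)]  # pi times uniform on directions
f_lift = [v for v in f for _ in range(2)]          # f ignores the direction

var_base = asymptotic_variance(base_kernel(w), pi, f)
var_lifted = asymptotic_variance(lifted_kernel(w), pi_lift, f_lift)
print("ratio lifted / base:", var_lifted / var_base)
```

Swapping in other weights, test functions, or direction mechanisms and watching whether the printed ratio ever exceeds 2 gives a quick numerical probe of the claim. It is a search tool only and proves nothing about the general case.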

Figures

Figures reproduced from arXiv: 2405.15952 by Florian Maire, Philippe Gagnon.

**Figure 1.** Asymptotic variance of the MCMC estimator of the standardized version of the mapping $x \mapsto \sum_{i=1}^{n} x_i$ for the MH algorithm, the lifted sampler and the reversible counterpart of the latter, all using the Barker proposal distribution when: (a) η increases from 10 to 50 and the other parameters are kept fixed (µ = 1 and λ = 0.5); (b) µ increases from 1 to 3.5 and the other parameters are kept fixed (η = 10 and …

**Figure 2.** Trace plots for the MH algorithm on the left panel and the lifted sampler on the right panel, both with the Barker proposal distribution.

Footnote 1: We do not claim optimality of our implementation, but we employed commonly used techniques for sampling from the conditional distributions $Q_\nu(x, \cdot)$ and computing the normalizing constant of the latter.

**Figure 3.** Illustration of the MH chain (left) and its lifted counterpart (right); the colour of the arrows indicates the level of the transition probabilities (darker is higher).

Proposition 4. In the context of this example, let $f$ be such that $f(1) = 1 = -f(3)$ and $f(2) = 0$. Then, $\operatorname{var}(P_{\mathrm{Lifted}}, f) = 2 \operatorname{var}(P_{\mathrm{MH}}, f)$ and $\operatorname{var}(P_{\mathrm{MH}}, f)$ is of order $1/\epsilon$. Given that the bound in Theorem 2 yields $2 = \operatorname{var}(f, P_{\mathrm{Lifted}}) / \operatorname{var}(f, P_{\mathrm{MH}})$ …

Original abstract

Lifted samplers form a class of Markov chain Monte Carlo methods which has drawn a lot of attention in recent years due to superior performance in challenging Bayesian applications. A canonical example of lifted samplers is the one that is derived from a random walk Metropolis algorithm for a totally-ordered state space such as the integers or the real numbers. The lifted sampler is derived by splitting the proposal distribution into two: one part in the increasing direction, and the other part in the decreasing direction. It keeps following a direction until a rejection occurs, upon which it flips the direction. In terms of asymptotic variances, it outperforms the random walk Metropolis algorithm, regardless of the target distribution, at no additional computational cost. Other studies show, however, that beyond this simple case, lifted samplers do not always outperform their Metropolis counterparts. In this paper, we leverage the celebrated work of Tierney (1998) to provide an analysis in a general framework encompassing a broad class of lifted samplers. Our finding is that, essentially, the asymptotic variances cannot increase by a factor of more than 2, regardless of the target distribution, the way the directions are induced, and the type of algorithm from which the lifted sampler is derived (be it a Metropolis-Hastings algorithm, a reversible jump algorithm, etc.). This result indicates that, while there is potentially a lot to gain from lifting a sampler, there is not much to lose.
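The canonical construction described in the abstract can be sketched as follows for a target on the integers. This is a minimal illustration and not the authors' implementation. The discretised Gaussian `log_target`, the ±1 step size, and all function names are assumptions made for the example.

```python
import math
import random


def log_target(x):
    # Hypothetical unnormalised log-target on the integers (discretised Gaussian);
    # any log-pmf known up to a constant could be swapped in.
    return -0.5 * (x / 5.0) ** 2


def accept(x, y, rng):
    # Metropolis acceptance with probability min(1, pi(y) / pi(x)).
    return rng.random() < math.exp(min(0.0, log_target(y) - log_target(x)))


def run_rwm(n_steps, x0=0, seed=0):
    """Random walk Metropolis: propose x - 1 or x + 1 with probability 1/2 each."""
    rng = random.Random(seed)
    x, path = x0, []
    for _ in range(n_steps):
        y = x + rng.choice((-1, 1))
        if accept(x, y, rng):
            x = y
        path.append(x)
    return path


def run_lifted(n_steps, x0=0, nu0=1, seed=0):
    """Lifted version: keep moving in direction nu, flip nu on rejection."""
    rng = random.Random(seed)
    x, nu, path = x0, nu0, []
    for _ in range(n_steps):
        y = x + nu
        if accept(x, y, rng):
            x = y      # accepted: move and keep the direction
        else:
            nu = -nu   # rejected: stay put and reverse the direction
        path.append((x, nu))
    return path


if __name__ == "__main__":
    print(run_rwm(10))
    print([x for x, _ in run_lifted(10)])
```

Both samplers evaluate one target ratio per iteration, which is the sense in which the lifted version comes at no additional computational cost in this simple case.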

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 0 minor

Summary. The paper claims that lifted samplers, constructed so their marginal on the original space matches a base algorithm (MH, reversible jump, etc.), have asymptotic variance at most twice that of the base, regardless of target or direction induction. The bound is obtained by direct application of Tierney (1998) to the augmented (state + direction) chain in a general framework.

Significance. If the result holds, it supplies a useful, distribution-free guarantee that lifting cannot inflate asymptotic variance by more than a factor of two, while potentially offering gains at no extra cost. The manuscript earns credit for obtaining the factor-of-two bound via an external reference (Tierney 1998) without introducing free parameters, fitted quantities, or self-referential derivations.

major comments (1)

[General framework analysis invoking Tierney (1998)] The factor-of-two claim rests on Tierney (1998) applying directly to the augmented lifted chain. The general framework analysis invokes the variance formula without an explicit check that the expression remains valid for the non-reversible augmented process (persistent direction until rejection, then flip). Because lifted constructions are non-reversible by design, and because Tierney's resolvent/autocovariance forms are typically derived under reversibility or detailed balance, the manuscript must verify applicability for arbitrary direction induction and reversible-jump bases; this step is load-bearing for the central bound.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their thoughtful review and for identifying this key point about the applicability of Tierney (1998). We address the major comment below.

Referee: [General framework analysis invoking Tierney (1998)] The factor-of-two claim rests on Tierney (1998) applying directly to the augmented lifted chain. The general framework analysis invokes the variance formula without an explicit check that the expression remains valid for the non-reversible augmented process (persistent direction until rejection, then flip). Because lifted constructions are non-reversible by design, and because Tierney's resolvent/autocovariance forms are typically derived under reversibility or detailed balance, the manuscript must verify applicability for arbitrary direction induction and reversible-jump bases; this step is load-bearing for the central bound.

Authors: We appreciate the referee drawing attention to this. Tierney (1998) derives the central limit theorem and asymptotic variance expression via the resolvent kernel for general state-space Markov chains; the derivation relies only on ergodicity and does not invoke reversibility or detailed balance. The lifted sampler is constructed as a standard Markov chain on the augmented space (original variable plus direction), and the same ergodicity assumptions used for the base algorithm carry over directly. Consequently the variance formula applies verbatim, including for arbitrary direction-induction mechanisms and reversible-jump bases. To make this explicit we will add a short clarifying paragraph in Section 2 of the revised manuscript.

revision: yes

Circularity Check

0 steps flagged

No circularity: bound derived from external Tierney (1998) framework

The paper's central result (asymptotic variance of lifted sampler at most twice that of the base) is obtained by invoking Tierney (1998) on the augmented chain under the explicit assumption that the marginal coincides with the base algorithm. No self-citation, fitted parameters renamed as predictions, self-definitional steps, or ansatz smuggling appear in the derivation chain. The reference is external and the construction is not tautological by the paper's own equations.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim rests on the applicability of Tierney's asymptotic variance formula to the lifted construction and on the assumption that the marginal transition kernel remains unchanged. No free parameters or new entities are introduced.

axioms (1)

domain assumption: Tierney's (1998) asymptotic variance formula applies to the direction-augmented lifted chain.
The paper explicitly leverages this result for the general analysis.

Reference graph

Works this paper leans on

25 extracted references · 25 canonical work pages

[1] Andrieu, C., Lee, A. and Livingstone, S. (2020) A general perspective on the Metropolis-Hastings kernel. arXiv:2012.14881
[2] Andrieu, C., Lee, A. and Vihola, M. (2018) Uniform ergodicity of the iterated conditional SMC and geometric ergodicity of particle Gibbs samplers (supplemental content). Bernoulli, 24, 842–872
[3] Andrieu, C. and Livingstone, S. (2021) Peskun-Tierney ordering for Markovian Monte Carlo: Beyond the reversible scenario. Ann. Statist., 49, 1958–1981
[4] Barker, A. A. (1965) Monte Carlo calculations of the radial distribution functions for a proton-electron plasma. Austral. J. Phys., 18, 119–134
[5] Beaumont, M. A. (2003) Estimation of population growth or decline in genetically monitored populations. Genetics, 164, 1139–1160
[6] Chen, F., Lovász, L. and Pak, I. (1999) Lifting Markov chains to speed up mixing. In Proceedings of the thirty-first annual ACM symposium on Theory of computing, 275–281
[7] Diaconis, P., Holmes, S. and Neal, R. M. (2000) Analysis of a nonreversible Markov chain sampler. Ann. Appl. Probab., 726–752
[8] Gagnon, P. and Doucet, A. (2021) Nonreversible jump algorithms for Bayesian nested model selection. J. Comput. Graph. Statist., 30, 312–323. arXiv:1911.01340
[9] Gagnon, P. and Maire, F. (2024) An asymptotic Peskun ordering and its application to lifted samplers. Bernoulli, 30, 2301–2325
[10] Green, P. J. (1995) Reversible jump Markov chain Monte Carlo computation and Bayesian model determination. Biometrika, 82, 711–732
[11] Gustafson, P. (1998) A guided walk Metropolis algorithm. Stat. Comput., 8, 357–364
[12] Hastings, W. K. (1970) Monte Carlo sampling methods using Markov chains and their applications. Biometrika, 57, 97–109
[13] Häggström, O. and Rosenthal, J. S. (2007) On Variance Conditions for Markov Chain CLTs. Electron. Commun. Probab., 12, 454–464
[14] Horowitz, A. M. (1991) A generalized guided Monte Carlo algorithm. Phys. Lett. B, 268, 247–252
[15] Liu, J. S., Liang, F. and Wong, W. H. (2000) The multiple-try method and local optimization in Metropolis sampling. J. Amer. Statist. Assoc., 95, 121–134
[16] Livingstone, S. and Zanella, G. (2022) The Barker proposal: combining robustness and efficiency in gradient-based MCMC. J. R. Stat. Soc. Ser. B. Stat. Methodol., 84, 496–523
[17] Metropolis, N., Rosenbluth, A. W., Rosenbluth, M. N., Teller, A. H. and Teller, E. (1953) Equation of state calculations by fast computing machines. J. Chem. Phys., 21, 1087
[18] Peskun, P. (1973) Optimum Monte-Carlo sampling using Markov chains. Biometrika, 60, 607–612
[19] Roberts, G. O. and Rosenthal, J. S. (2004) General state space Markov chains and MCMC algorithms. Probab. Surv., 1, 20–71
[20] Sakai, Y. and Hukushima, K. (2016a) Eigenvalue analysis of an irreversible random walk with skew detailed balance conditions. Phys. Rev. E, 93, 043318
[21] Sakai, Y. and Hukushima, K. (2016b) Irreversible simulated tempering. J. Phys. Soc. Jpn., 85, 104002
[22] Syed, S., Bouchard-Côté, A., Deligiannidis, G. and Doucet, A. (2022) Non-reversible parallel tempering: A scalable highly parallel MCMC scheme. J. R. Stat. Soc. Ser. B. Stat. Methodol., 84, 321–350
[23] Tierney, L. (1998) A note on Metropolis-Hastings kernels for general state spaces. Ann. Appl. Probab., 8, 1–9
[24] Vucelja, M. (2016) Lifting – a nonreversible Markov chain Monte Carlo algorithm. Amer. J. Phys., 84, 958–968
[25] Zanella, G. (2020) Informed proposals for local MCMC in discrete spaces. J. Amer. Statist. Assoc., 115, 852–865
