pith. sign in

arxiv: 2606.19909 · v1 · pith:MFYYYS7Tnew · submitted 2026-06-18 · 📊 stat.CO · math.PR· stat.ME

Establishing an Ω(sqrt{d}) complexity lower bound for PDMP samplers and how to break it: a sub-sqrt{d} algorithm for Gaussian-tailed targets

Pith reviewed 2026-06-26 15:18 UTC · model grok-4.3

classification 📊 stat.CO math.PRstat.ME
keywords PDMP samplerscomplexity lower boundhigh-dimensional samplingGaussian-tailed targetsnon-reversible MCMCadaptive trajectoriescontinuous invariance
0
0 comments X

The pith

Standard PDMP samplers face an Ω(√d) complexity lower bound, but a new scheme that relaxes continuous invariance achieves O(d^α) scaling with α in [0.2, 0.3] for Gaussian-tailed targets.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that Piecewise Deterministic Markov Process samplers cannot improve beyond Ω(√d) computational complexity in the usual setup where the target density must stay invariant at every continuous time. This invariance requirement had blocked all earlier PDMP methods from beating square-root scaling in dimension. By dropping the continuous-time invariance condition, the authors build a new PDMP algorithm that is locally adaptive in trajectory length and velocity-update spacing and that empirically needs only O(d^α) work with α between 0.2 and 0.3 on distributions whose tails decay like a Gaussian. The result matters because high-dimensional sampling appears in Bayesian statistics, machine learning, and physics simulation, where even modest reductions in the exponent on d can make previously intractable problems feasible.

Core claim

We prove that this is a fundamental limitation by establishing an Ω(√d) lower bound on the algorithmic complexity of PDMP samplers in a standard setup. By relaxing the assumption that the target density must remain invariant at all continuous times, we then demonstrate how to bypass this barrier. Specifically, we introduce a novel PDMP sampling scheme and show that it achieves an empirical complexity of O(d^α), where α ∈ [0.2, 0.3] for Gaussian-tailed targets. In addition, this PDMP scheme is locally adaptive in both trajectory length and distance between velocity updates.

What carries the argument

A novel PDMP sampling scheme that relaxes continuous-time target invariance while adapting locally to trajectory length and inter-update distances.

If this is right

  • Standard PDMP samplers cannot scale better than Ω(√d) under the continuous-invariance requirement.
  • The new scheme achieves empirical O(d^α) complexity with α ∈ [0.2, 0.3] on Gaussian-tailed targets.
  • The scheme adapts locally to both trajectory length and distance between velocity updates.
  • Relaxing continuous invariance is sufficient to bypass the proven lower bound.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same relaxation of invariance might be tested on other families of non-reversible samplers to check whether comparable dimension gains appear.
  • If the local adaptation rules generalize, the method could be tried on targets whose tails decay faster or slower than Gaussian.
  • In applied settings the reduced exponent would directly lower the dimension at which posterior sampling becomes practical for models with roughly Gaussian posteriors.

Load-bearing premise

The lower bound requires that the target density remain invariant at every continuous time.

What would settle it

A concrete counter-example would be any PDMP sampler that achieves better than √d scaling while keeping the target density invariant at all continuous times; for the new scheme, failure to observe O(d^α) with α < 0.5 on standard Gaussian-tailed test problems would challenge the bypass.

Figures

Figures reproduced from arXiv: 2606.19909 by Augustin Chevallier.

Figure 1
Figure 1. Figure 1: Three leapfrog steps in phase space for a 1D target. The discrete [PITH_FULL_IMAGE:figures/full_fig_p015_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: Example trajectory for a 1D target. The idealized weight [PITH_FULL_IMAGE:figures/full_fig_p018_2.png] view at source ↗
Figure 3
Figure 3. Figure 3: The idealized weight w exact t (orange) and the corresponding MH pro￾posal distribution qY (blue) for a 1D target. 4.4 Algorithm Summary The full sampling procedure for a single iteration is detailed in Algorithm 1. Algorithm 1 Single Iteration of the Metropolized Oscillating PDMP Input: Current state (x0, v0) Output: Next state (xnext, vnext) 1. Initialize: Compute H0 = H(x0, v0) and define ˜µH0 p (scalin… view at source ↗
Figure 4
Figure 4. Figure 4: Scaling analysis for baseline targets with [PITH_FULL_IMAGE:figures/full_fig_p022_4.png] view at source ↗
Figure 5
Figure 5. Figure 5: Robustness testing on targets with Non-Gaussian tails. Top: Logis￾tic target (heavier tails); Bottom: Quartic target (lighter tails). Each sub-figure displays Events per ESS, Number of GradientEvaluations per ESS, and Mean Acceptance Rate. 23 [PITH_FULL_IMAGE:figures/full_fig_p023_5.png] view at source ↗
read the original abstract

Despite the theoretical appeal of their non-reversibility, to date, no Piecewise Deterministic Markov Process (PDMP) samplers have been developed that scale better than $\mathcal{O}(\sqrt{d})$ in computational complexity with respect to the target dimension $d$. We prove that this is a fundamental limitation by establishing an $\Omega(\sqrt{d})$ lower bound on the algorithmic complexity of PDMP samplers in a standard setup. By relaxing the assumption that the target density must remain invariant at all continuous times, we then demonstrate how to bypass this barrier. Specifically, we introduce a novel PDMP sampling scheme and show that it achieves an empirical complexity of $\mathcal{O}(d^\alpha)$, where $\alpha \in [0.2, 0.3]$ for Gaussian-tailed targets. In addition, this PDMP scheme is locally adaptive in both trajectory length and distance between velocity updates.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper proves an Ω(√d) lower bound on the algorithmic complexity of PDMP samplers under the standard setup requiring continuous-time invariance of the target density. It then introduces a novel PDMP scheme that relaxes this invariance, is locally adaptive in trajectory length and velocity-update distance, and empirically achieves O(d^α) complexity with α ∈ [0.2, 0.3] for Gaussian-tailed targets.

Significance. If the lower-bound proof is rigorous and the empirical complexity is counted on identical footing (same units of gradient/velocity-update cost per effective sample), the work would be significant: it both identifies a fundamental limit for invariant PDMPs and demonstrates a concrete way to circumvent it. The combination of a matching lower bound with a bypassing algorithm, plus the local adaptivity, strengthens the contribution.

major comments (3)
  1. [§3] §3 (lower-bound statement and proof): the definition of algorithmic complexity (velocity updates or gradient calls per effective sample) must be stated explicitly and shown to be identical to the metric used for the empirical scaling in §5; without this, the claim that the new scheme breaks the Ω(√d) bound cannot be verified.
  2. [§4.2] §4.2 (definition of the new scheme): the precise manner in which continuous-time invariance is relaxed is load-bearing for the bypass argument; the text must clarify whether the process remains a PDMP at every instant or only at discrete update times, and whether any additional overhead from the adaptation mechanism is charged to the complexity count.
  3. [§5] §5 (empirical complexity plots and tables): the reported α ∈ [0.2, 0.3] is obtained over which range of dimensions, and how is effective sample size computed (e.g., integrated autocorrelation time)? If the adaptation cost or trajectory-length overhead is omitted from the count, the apparent sub-√d scaling does not directly contradict the lower bound.
minor comments (2)
  1. [Abstract] Abstract: the phrase 'standard setup' should be defined in one sentence so readers immediately understand the invariance assumption being relaxed.
  2. [§2] Notation: the symbol for effective sample size or the precise complexity unit should be introduced once in §2 and used consistently thereafter.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the careful reading and constructive comments. We address each major comment below and indicate the revisions that will be made.

read point-by-point responses
  1. Referee: [§3] §3 (lower-bound statement and proof): the definition of algorithmic complexity (velocity updates or gradient calls per effective sample) must be stated explicitly and shown to be identical to the metric used for the empirical scaling in §5; without this, the claim that the new scheme breaks the Ω(√d) bound cannot be verified.

    Authors: We agree that an explicit definition is required for the lower-bound claim to be verifiable. In the revised manuscript we will add a dedicated paragraph in §3 that defines algorithmic complexity precisely as the number of velocity updates plus gradient evaluations per effective sample, and we will insert a sentence confirming that the identical metric is used for the empirical scaling reported in §5. revision: yes

  2. Referee: [§4.2] §4.2 (definition of the new scheme): the precise manner in which continuous-time invariance is relaxed is load-bearing for the bypass argument; the text must clarify whether the process remains a PDMP at every instant or only at discrete update times, and whether any additional overhead from the adaptation mechanism is charged to the complexity count.

    Authors: The scheme preserves the PDMP structure on each inter-update interval while relaxing continuous-time invariance of the target density so that invariance holds only at the discrete update instants. All costs incurred by the local adaptation rules (trajectory-length selection and velocity-update distances) are included in the complexity count. We will revise §4.2 to state these facts explicitly. revision: yes

  3. Referee: [§5] §5 (empirical complexity plots and tables): the reported α ∈ [0.2, 0.3] is obtained over which range of dimensions, and how is effective sample size computed (e.g., integrated autocorrelation time)? If the adaptation cost or trajectory-length overhead is omitted from the count, the apparent sub-√d scaling does not directly contradict the lower bound.

    Authors: We will add to §5 the exact range of dimensions over which the reported scaling was observed, confirm that effective sample size is obtained from the integrated autocorrelation time, and state that every adaptation and trajectory-length cost is charged to the complexity metric. These additions will make the empirical results directly comparable to the lower bound. revision: yes

Circularity Check

0 steps flagged

No significant circularity; lower bound and empirical bypass are independent.

full rationale

The paper states an Ω(√d) lower bound for PDMPs under the standard invariance-at-all-times assumption, then relaxes that assumption to introduce a new scheme with reported O(d^α) empirical scaling (α ∈ [0.2,0.3]). No equations, definitions, or self-citations are shown that reduce the lower bound or the complexity metric to a fitted quantity or prior result by construction. The central claims remain self-contained against external benchmarks; the relaxation explicitly addresses the bound's premise rather than redefining it.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The central claims rest on an unstated 'standard setup' for the lower bound and on empirical measurement of complexity for the new scheme; no free parameters, axioms, or invented entities are described in the abstract.

pith-pipeline@v0.9.1-grok · 5699 in / 1144 out tokens · 22927 ms · 2026-06-26T15:18:09.072568+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

25 extracted references · 2 linked inside Pith

  1. [1]

    Automated techniques for effi- cient sampling of piecewise-deterministic Markov processes.arXiv preprint arXiv:2408.03682, 2024

    Charly Andral and Kengo Kamatani. Automated techniques for effi- cient sampling of piecewise-deterministic Markov processes.arXiv preprint arXiv:2408.03682, 2024. 24

  2. [2]

    Optimal tuning of the hybrid Monte Carlo algorithm.Bernoulli, 19(5A):1501–1534, 2013

    Alexandros Beskos, Natesh Pillai, Gareth Roberts, Jesus-Maria Sanz- Serna, and Andrew Stuart. Optimal tuning of the hybrid Monte Carlo algorithm.Bernoulli, 19(5A):1501–1534, 2013

  3. [3]

    A conceptual introduction to Hamiltonian Monte Carlo.arXiv preprint arXiv:1701.02434, 2017

    Michael Betancourt. A conceptual introduction to Hamiltonian Monte Carlo.arXiv preprint arXiv:1701.02434, 2017

  4. [4]

    The zig-zag process and super-efficient sampling for Bayesian analysis of big data

    Joris Bierkens, Paul Fearnhead, and Gareth Roberts. The zig-zag process and super-efficient sampling for Bayesian analysis of big data. 2019

  5. [5]

    The boomerang sampler

    Joris Bierkens, Sebastiano Grazzi, Kengo Kamatani, and Gareth Roberts. The boomerang sampler. InInternational conference on machine learning, pages 908–918. PMLR, 2020

  6. [6]

    The within-orbit adaptive leapfrog no-u-turn sampler.arXiv preprint arXiv:2506.18746, 2025

    Nawaf Bou-Rabee, Bob Carpenter, Tore Selland Kleppe, and Sifan Liu. The within-orbit adaptive leapfrog no-u-turn sampler.arXiv preprint arXiv:2506.18746, 2025

  7. [7]

    Incorporating local step-size adaptivity into the no-u-turn sampler using gibbs self-tuning.The Journal of Chemical Physics, 163(8):084119, 08 2025

    Nawaf Bou-Rabee, Bob Carpenter, Tore Selland Kleppe, and Milo Marsden. Incorporating local step-size adaptivity into the no-u-turn sampler using gibbs self-tuning.The Journal of Chemical Physics, 163(8):084119, 08 2025

  8. [8]

    Gist: Gibbs self-tuning for locally adaptive hamiltonian monte carlo.arXiv preprint arXiv:2404.15253, 2025

    Nawaf Bou-Rabee, Bob Carpenter, and Milo Marsden. Gist: Gibbs self-tuning for locally adaptive hamiltonian monte carlo.arXiv preprint arXiv:2404.15253, 2025

  9. [9]

    The bouncy particle sampler: A nonreversible rejection-free Markov chain Monte Carlo method.Journal of the American Statistical Association, 113(522):855–867, 2018

    Alexandre Bouchard-Cˆ ot´ e, Sebastian J Vollmer, and Arnaud Doucet. The bouncy particle sampler: A nonreversible rejection-free Markov chain Monte Carlo method.Journal of the American Statistical Association, 113(522):855–867, 2018

  10. [10]

    Towards practical pdmp sampling: Metropolis adjustments, locally adaptive step-sizes, and nuts-based time lengths, 2025

    Augustin Chevallier, Sam Power, and Matthew Sutton. Towards practical pdmp sampling: Metropolis adjustments, locally adaptive step-sizes, and nuts-based time lengths, 2025

  11. [11]

    Automatic zig-zag sampling in practice.Statistics and Computing, 32(6):107, 2022

    Alice Corbella, Simon EF Spencer, and Gareth O Roberts. Automatic zig-zag sampling in practice.Statistics and Computing, 32(6):107, 2022

  12. [12]

    Mark H. A. Davis.Markov Models and Optimization, volume 49 ofMono- graphs on Statistics and Applied Probability. Chapman & Hall, London, 1993

  13. [13]

    Randomized Hamiltonian Monte Carlo as scaling limit of the bouncy particle sampler and dimension-free convergence rates.The Annals of Applied Probability, 31(6):2612 – 2662, 2021

    George Deligiannidis, Daniel Paulin, Alexandre Bouchard-Cˆ ot´ e, and Ar- naud Doucet. Randomized Hamiltonian Monte Carlo as scaling limit of the bouncy particle sampler and dimension-free convergence rates.The Annals of Applied Probability, 31(6):2612 – 2662, 2021

  14. [14]

    Piecewise deterministic Markov processes for continuous-time Monte Carlo

    Paul Fearnhead, Joris Bierkens, Murray Pollock, and Gareth O Roberts. Piecewise deterministic Markov processes for continuous-time Monte Carlo. Statistical Science, 33(3):386–412, 2018. 25

  15. [15]

    The no-u-turn sampler: adap- tively setting path lengths in Hamiltonian Monte Carlo.J

    Matthew D Hoffman, Andrew Gelman, et al. The no-u-turn sampler: adap- tively setting path lengths in Hamiltonian Monte Carlo.J. Mach. Learn. Res., 15(1):1593–1623, 2014

  16. [16]

    Liptser and Albert N

    Robert S. Liptser and Albert N. Shiryaev.Theory of Martingales, vol- ume 49. Springer, 1989

  17. [17]

    On time reversal of piecewise deterministic Markov processes.Electronic Journal of Probability, 18(1):1 – 29, 2013

    Andreas L¨ opker and Zbigniew Palmowski. On time reversal of piecewise deterministic Markov processes.Electronic Journal of Probability, 18(1):1 – 29, 2013

  18. [18]

    Martin, Oriol Abril-Pla, Jordan Deklerk, Seth D

    Osvaldo A. Martin, Oriol Abril-Pla, Jordan Deklerk, Seth D. Axen, Colin Carroll, Ari Hartikainen, and Aki Vehtari. Arviz: a modular and flexible library for exploratory analysis of bayesian models.Journal of Open Source Software, 11(119):9889, 2026

  19. [19]

    ATLAS: Adapting Trajectory Lengths and Step-Size for Hamiltonian Monte Carlo.arXiv preprint arXiv:2410.21587, 2024

    Chirag Modi. ATLAS: Adapting Trajectory Lengths and Step-Size for Hamiltonian Monte Carlo.arXiv preprint arXiv:2410.21587, 2024

  20. [20]

    Delayed rejection Hamilto- nian Monte Carlo for sampling multiscale distributions.Bayesian Analysis, 19(3):815 – 842, 2024

    Chirag Modi, Alex Barnett, and Bob Carpenter. Delayed rejection Hamilto- nian Monte Carlo for sampling multiscale distributions.Bayesian Analysis, 19(3):815 – 842, 2024

  21. [21]

    MCMC using Hamiltonian dynamics.Handbook of markov chain monte carlo, 2(11):2, 2011

    Radford M Neal et al. MCMC using Hamiltonian dynamics.Handbook of markov chain monte carlo, 2(11):2, 2011

  22. [22]

    Weak convergence and optimal scaling of random walk Metropolis algorithms.The annals of applied probability, 7(1):110–120, 1997

    Gareth O Roberts, Andrew Gelman, and Walter R Gilks. Weak convergence and optimal scaling of random walk Metropolis algorithms.The annals of applied probability, 7(1):110–120, 1997

  23. [23]

    Optimal scaling of discrete approximations to Langevin diffusions.Journal of the Royal Statistical Society: Series B (Statistical Methodology), 60(1):255–268, 1998

    Gareth O Roberts and Jeffrey S Rosenthal. Optimal scaling of discrete approximations to Langevin diffusions.Journal of the Royal Statistical Society: Series B (Statistical Methodology), 60(1):255–268, 1998

  24. [24]

    Concave-convex PDMP-based sam- pling.Journal of Computational and Graphical Statistics, 32(4):1425–1435, 2023

    Matthew Sutton and Paul Fearnhead. Concave-convex PDMP-based sam- pling.Journal of Computational and Graphical Statistics, 32(4):1425–1435, 2023

  25. [25]

    Sampling from multi- scale densities with delayed rejection generalized Hamiltonian Monte Carlo

    Gilad Turok, Chirag Modi, and Bob Carpenter. Sampling from multi- scale densities with delayed rejection generalized Hamiltonian Monte Carlo. arXiv preprint arXiv:2406.02741, 2024. A Deferred Proofs A.1 Complexity lower bound A.2 Detailed proof of Proposition 1 We start by two useful lemmas before the proof of the proposition itself. 26 Lemma 2(Lower Boun...