Ergodicity of Langevin Dynamics and its Discretizations for Non-smooth Potentials
Pith reviewed 2026-05-25 08:34 UTC · model grok-4.3
The pith
Subgradient Langevin dynamics converge exponentially to the target Gibbs distribution for strongly convex but non-differentiable potentials, with geometrically ergodic discretizations satisfying the law of large numbers.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
For potentials U that are strongly convex but possibly non-differentiable, the subgradient Langevin dynamics are exponentially ergodic to the target density π(x) ∝ e^{-U(x)} in continuous time. Certain explicit and semi-implicit discretizations are geometrically ergodic, converge to π as the discretization step size tends to zero, and satisfy the law of large numbers, allowing consecutive iterates of the Markov chain to compute statistics of the stationary distribution.
What carries the argument
Subgradient Langevin dynamics, obtained by replacing the gradient of U in the drift of the overdamped Langevin equation with an element of its subdifferential.
If this is right
- The continuous subgradient dynamics contract exponentially toward π.
- The discrete schemes become arbitrarily accurate approximations to π for sufficiently small step sizes.
- A single long trajectory from any of the discrete schemes can be used to estimate expectations via simple averaging.
- The results apply directly to sampling problems in imaging where the potential is convex but non-smooth.
Where Pith is reading between the lines
- The same contraction arguments might adapt to other first-order stochastic processes whose drift is set-valued.
- In practice the law of large numbers reduces the need to run many independent chains, lowering total wall-clock time for Monte Carlo estimates.
- The geometric rates could be used to derive explicit error bounds that incorporate both discretization and mixing time.
Load-bearing premise
The potential U must be strongly convex.
What would settle it
A numerical experiment on a strongly convex non-differentiable U where the empirical distribution of the discretized chain fails to approach π as the step size is driven to zero, or where the chain exhibits no geometric convergence rate, would falsify the claims.
Figures
read the original abstract
This article is concerned with sampling from Gibbs distributions $\pi(x)\propto e^{-U(x)}$ using Markov chain Monte Carlo methods. In particular, we investigate Langevin dynamics in the continuous- and the discrete-time setting for such distributions with potentials $U(x)$ which are strongly-convex but possibly non-differentiable. We show that the corresponding subgradient Langevin dynamics are exponentially ergodic to the target density $\pi$ in the continuous setting and that certain explicit as well as semi-implicit discretizations are geometrically ergodic and approximate $\pi$ for vanishing discretization step size. Moreover, we prove that the discrete schemes satisfy the law of large numbers allowing to use consecutive iterates of a Markov chain in order to compute statistics of the stationary distribution posing a significant reduction of computational complexity in practice. Numerical experiments are provided confirming the theoretical findings and showcasing the practical relevance of the proposed methods in imaging applications.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proves that for strongly convex but possibly non-differentiable potentials U, the continuous-time subgradient Langevin dynamics are exponentially ergodic to the target Gibbs measure π; explicit and semi-implicit discretizations are geometrically ergodic, converge to π as the step size vanishes, and satisfy a law of large numbers that permits using consecutive chain iterates for Monte Carlo estimation. Numerical experiments on imaging problems are included to illustrate the results.
Significance. If the proofs are correct, the work provides a rigorous extension of ergodicity theory to non-smooth convex potentials that arise in total-variation or indicator-constrained sampling problems. The LLN result is practically useful because it removes the need for burn-in or thinning. The combination of continuous and discrete analysis plus reproducible numerics strengthens the contribution.
major comments (3)
- [§2.2, Theorem 3.1] §2.2 and Theorem 3.1: the strong monotonicity of the subdifferential ∂U is invoked to obtain the contraction in Wasserstein distance, but the argument requires verifying that the multi-valued inclusion remains well-posed and that the resolvent is single-valued or appropriately measurable; the current sketch does not explicitly cite the required measurability or selection theorem.
- [Theorem 4.3] Theorem 4.3 (geometric ergodicity of the semi-implicit scheme): the step-size restriction appears to depend on the strong-convexity modulus μ and the Lipschitz constant of the smooth part; it is unclear whether the bound remains uniform when the non-smooth part is only convex (e.g., an indicator function) rather than strongly convex.
- [§5] §5 (LLN): the proof of the law of large numbers for the discrete chain relies on geometric ergodicity plus a moment bound; the moment bound is stated to follow from the same Lyapunov function used for ergodicity, but the verification that the Lyapunov function works uniformly in the discretization parameter h is only sketched.
minor comments (2)
- [§2] Notation for the subdifferential is introduced in §2 but the precise definition of the subgradient Langevin SDE (equation (2.3)) should explicitly indicate the measurable selection used for the driving noise term.
- [Numerical experiments] Figure 1 and the imaging experiments: the caption should state the precise value of the regularization parameter and the discretization step size h used, to allow direct comparison with the theoretical rates.
Simulated Author's Rebuttal
We thank the referee for the careful reading and constructive feedback on our manuscript. The comments highlight areas where additional rigor and clarification will strengthen the presentation. We address each major comment point-by-point below and will revise the manuscript accordingly.
read point-by-point responses
-
Referee: [§2.2, Theorem 3.1] §2.2 and Theorem 3.1: the strong monotonicity of the subdifferential ∂U is invoked to obtain the contraction in Wasserstein distance, but the argument requires verifying that the multi-valued inclusion remains well-posed and that the resolvent is single-valued or appropriately measurable; the current sketch does not explicitly cite the required measurability or selection theorem.
Authors: We agree that the sketch in §2.2 can be made fully rigorous by citing the appropriate background. Strong monotonicity of ∂U (which follows from μ-strong convexity of U) implies that the resolvent is single-valued. Well-posedness of the differential inclusion and measurability of selections are standard consequences of convex analysis. In the revision we will add a short paragraph in §2.2 referencing Theorem 8.1.3 of Aubin–Frankowska (Set-Valued Analysis) for measurable selection and noting that the strong-monotonicity assumption guarantees a unique absolutely continuous solution. This does not change the statement or proof of Theorem 3.1. revision: yes
-
Referee: [Theorem 4.3] Theorem 4.3 (geometric ergodicity of the semi-implicit scheme): the step-size restriction appears to depend on the strong-convexity modulus μ and the Lipschitz constant of the smooth part; it is unclear whether the bound remains uniform when the non-smooth part is only convex (e.g., an indicator function) rather than strongly convex.
Authors: The step-size condition in Theorem 4.3 is derived from the μ-strong convexity of the full potential U = f + g, where f is smooth and L-Lipschitz and g is convex (possibly non-strongly convex). When g is an indicator of a convex set, strong convexity is supplied entirely by f; the same algebraic estimates used in the proof continue to hold with the same μ and L, so the bound remains uniform in this case. We will insert a remark after Theorem 4.3 explicitly stating that the result applies verbatim when g is merely convex (including indicators) provided U itself is μ-strongly convex, and we will verify that no additional restriction on h arises. revision: yes
-
Referee: [§5] §5 (LLN): the proof of the law of large numbers for the discrete chain relies on geometric ergodicity plus a moment bound; the moment bound is stated to follow from the same Lyapunov function used for ergodicity, but the verification that the Lyapunov function works uniformly in the discretization parameter h is only sketched.
Authors: We acknowledge that the uniformity of the moment bound with respect to h deserves a more detailed argument. The Lyapunov function V constructed for geometric ergodicity satisfies a uniform drift condition for all h ≤ h0 (with h0 depending only on μ and L). In the revision we will expand the proof of the moment bound in §5 by explicitly tracking the constants and showing that E[V(X_k)] remains bounded by a constant independent of h (for h small) and of the iteration index k. This will make the application of the ergodic theorem for the LLN fully rigorous. revision: yes
Circularity Check
No circularity: direct mathematical proofs of ergodicity
full rationale
The paper establishes exponential ergodicity for the continuous subgradient Langevin dynamics and geometric ergodicity plus LLN for explicit/semi-implicit discretizations via contraction arguments in Wasserstein distance that rely on the strong-convexity assumption and subdifferential monotonicity. These are standard first-principles derivations from the SDE/inclusion properties and do not reduce by construction to fitted inputs, self-definitions, or load-bearing self-citations. The derivation chain is self-contained against external benchmarks such as existing ergodicity theory for convex potentials.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption U is strongly convex
- standard math Subdifferential of U exists and satisfies standard convex analysis properties
Reference graph
Works this paper leans on
-
[1]
doi: 10.1070/SM2002v193n07ABEH000665. F . Bolley, I. Gentil, and A. Guillin. Convergence to equilibrium in Wasserstein distance for Fokker– Planck equations. Journal of Functional Analysis, 263(8):2430–2457,
- [2]
-
[3]
doi: 10.1007/s10851-010-0251-1. Y. Chen, S. Chewi, A. Salim, and A. Wibisono. Improved analysis for a proximal algorithm for sampling. In Proceedings of Thirty Fifth Conference on Learning Theory , volume 178 of Pro- ceedings of Machine Learning Research , pages 2984–3014. PMLR, 02–05 Jul
-
[4]
doi: https://doi.org/10.1016/j.media.2022.102479
ISSN 1361-8415. doi: https://doi.org/10.1016/j.media.2022.102479. A. S. Dalalyan. Theoretical guarantees for approximate sampling from smooth and log-concave dens- ities. Journal of the Royal Statistical Society Series B: Statistical Methodology, 79(3):651–676,
- [5]
-
[6]
doi: 10.3150/bj/1066223276. L. Hodgkinson, R. Salomone, and F . Roosta. Implicit Langevin algorithms for sampling from log- concave densities. Journal of Machine Learning Research, 22(136):1–30,
-
[7]
doi: 10.1080/10618600.2020. 1811105. J. Liang and Y. Chen. A proximal algorithm for sampling from non-smooth potentials. In2022 Winter Simulation Conference (WSC), pages 3229–3240,
-
[8]
doi: 10.1109/WSC57314.2022.10015293. G. Luo, M. Blumenthal, M. Heide, and M. Uecker. Bayesian mri reconstruction with joint uncertainty estimation using diffusion models. Magnetic Resonance in Medicine , 90(1):295–311,
-
[9]
doi: https://doi.org/10.1002/mrm.29624. T . D. Luu, J. Fadili, and C. Chesneau. Sampling from non-smooth distributions through Langevin diffusion. Methodology and Computing in Applied Probability , 23(4):1173–1201,
-
[10]
URL https://proceedings.neurips.cc/paper_files/paper/ 2020/file/2779fda014fbadb761f67dd708c1325e-Paper.pdf. Y. Song, J. Sohl-Dickstein, D. P . Kingma, A. Kumar, S. Ermon, and B. Poole. Score-based generative modeling through stochastic differential equations. arXiv preprint arXiv:2011.13456,
work page internal anchor Pith review Pith/arXiv arXiv 2020
-
[11]
doi: 10.1109/TSP .2019.2894825. M. Vono, N. Dobigeon, and P . Chainais. Asymptotically exact data augmentation: Models, properties, and algorithms. Journal of Computational and Graphical Statistics , 30(2):335–348,
work page doi:10.1109/tsp 2019
- [12]
- [13]
-
[14]
doi: 10.1109/TMI.2023.3311345. 26
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.