Accelerating Constrained Sampling: A Large Deviations Approach

Changwei Tu; Lingjiong Zhu; Xiaoyu Wang; Yingli Wang

arxiv: 2506.07816 · v3 · submitted 2025-06-09 · 📊 stat.ML · cs.LG · math.PR

Pith reviewed 2026-05-19 10:42 UTC · model grok-4.3

classification 📊 stat.ML · cs.LG · math.PR

keywords constrained sampling · large deviation principle · reflected Langevin dynamics · non-reversible dynamics · skew-symmetric matrix · asymptotic variance · Monte Carlo methods · machine learning

The pith

A skew-symmetric matrix A satisfying A n = 0 on the boundary makes the large deviation rate function for skew-reflected non-reversible Langevin dynamics strictly larger than for reflected Langevin dynamics, which is what drives the acceleration.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes a large deviation principle for the empirical measure of skew-reflected non-reversible Langevin dynamics (SRNLD) when the skew-symmetric matrix is chosen so that its product with the outward unit normal vanishes on the boundary. Explicit comparison of the resulting rate functions shows that this choice produces a strictly larger rate function than standard reflected Langevin dynamics (RLD). A larger rate function means that deviations of the empirical measure from the target become exponentially less likely, which implies faster long-time convergence to the target distribution and lower asymptotic variance of estimators. This matters for machine learning tasks that require sampling probability distributions supported on constrained domains, because quicker mixing reduces the number of steps needed for accurate Monte Carlo estimates. Numerical experiments with the corresponding discretized algorithm (SRNLMC) are consistent with the predicted performance gains.

Core claim

When the skew-symmetric matrix satisfies A n = 0 on the boundary, the large deviation rate function for the empirical measure of SRNLD is strictly larger than that of RLD, implying faster convergence and reduced asymptotic variance. The authors establish the LDP under this boundary condition and characterize the rate functions to obtain the strict inequality.

What carries the argument

A skew-symmetric matrix A chosen to satisfy A n = 0 on the boundary, where n is the outward unit normal. With this choice the non-reversible reflected dynamics have a strictly larger large deviation rate function than the reversible reflected dynamics.
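To make the boundary condition concrete, here is a minimal sketch of one matrix field that satisfies it. It assumes the domain is the unit ball in three dimensions, which the page above does not specify, and uses the cross-product matrix of x as a hypothetical choice of A. The function name `skew_matrix` and the random stand-in for a gradient are illustrative only and are not taken from the paper.

```python
import numpy as np

def skew_matrix(x):
    """Cross-product matrix of x, so that skew_matrix(x) @ v == np.cross(x, v)."""
    return np.array([
        [0.0, -x[2], x[1]],
        [x[2], 0.0, -x[0]],
        [-x[1], x[0], 0.0],
    ])

rng = np.random.default_rng(0)

# A point on the boundary of the unit ball (assumed domain).
x = rng.normal(size=3)
x /= np.linalg.norm(x)

n = x.copy()              # outward unit normal of the unit ball at x
J = skew_matrix(x)
g = rng.normal(size=3)    # arbitrary vector standing in for a gradient at x

skew_ok = np.allclose(J, -J.T)              # J is skew-symmetric
normal_ok = np.allclose(J @ n, 0.0)         # J n = 0 on the boundary
tangent_ok = np.isclose(n @ (J @ g), 0.0)   # J g has no normal component

print(skew_ok, normal_ok, tangent_ok)
```

The example uses three dimensions on purpose: every 2 x 2 skew-symmetric matrix is a multiple of a rotation by 90 degrees, so in the plane the condition A n = 0 forces A to vanish on the boundary.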

If this is right

The empirical measure of SRNLD converges to the target distribution faster than that of RLD in the long-time limit.
Estimators based on SRNLD have strictly lower asymptotic variance than those based on RLD.
SRNLMC, the discretization of SRNLD, outperforms projected Langevin Monte Carlo (PLMC) on constrained sampling tasks when it uses the proposed matrix.
The same boundary condition can be used to accelerate sampling in any application that relies on reflected stochastic dynamics.
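The discretized comparison in the list above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's algorithm: it assumes the unit ball in three dimensions as the domain, an Euler step followed by Euclidean projection, a hypothetical Gaussian-type potential centred at `CENTER`, and the cross-product matrix as the skew field. All names (`run_chain`, `skew_scale`, `CENTER`) are invented for the sketch. Setting `skew_scale=0.0` recovers a plain projected Langevin step.

```python
import numpy as np

# Hypothetical target: density proportional to exp(-|x - CENTER|^2 / 2) on the unit ball.
CENTER = np.array([0.5, 0.0, 0.0])

def skew_matrix(x):
    """Cross-product matrix of x; it annihilates the normal x/|x| of the unit ball."""
    return np.array([
        [0.0, -x[2], x[1]],
        [x[2], 0.0, -x[0]],
        [-x[1], x[0], 0.0],
    ])

def grad_f(x):
    """Gradient of the assumed potential f(x) = |x - CENTER|^2 / 2."""
    return x - CENTER

def project_unit_ball(x):
    """Euclidean projection onto the closed unit ball."""
    r = np.linalg.norm(x)
    return x if r <= 1.0 else x / r

def run_chain(n_steps, step, skew_scale, seed=0):
    """Projected Euler scheme with drift -(I + skew_scale * J(x)) grad f(x)."""
    rng = np.random.default_rng(seed)
    x = np.zeros(3)
    chain = np.empty((n_steps, 3))
    for k in range(n_steps):
        g = grad_f(x)
        drift = g + skew_scale * (skew_matrix(x) @ g)
        noise = np.sqrt(2.0 * step) * rng.normal(size=3)
        x = project_unit_ball(x - step * drift + noise)
        chain[k] = x
    return chain

plmc_like = run_chain(2000, 1e-2, skew_scale=0.0)    # no skew term
srnlmc_like = run_chain(2000, 1e-2, skew_scale=1.0)  # with skew term
```

Both chains stay inside the domain by construction. Whether the skewed chain actually mixes faster for a given target, step size, and skew scale is an empirical question that this sketch does not answer.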

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The zero-product boundary condition might be portable to other classes of reflected or constrained stochastic processes to obtain similar rate-function reductions.
High-dimensional Bayesian inference problems with inequality constraints could see practical speed-ups from the reduced variance.
Direct comparison of convergence rates against underdamped or Hamiltonian Monte Carlo variants on the same constrained domains would quantify relative gains.
Relaxing the smoothness assumption on the domain while keeping the matrix condition could test how robust the rate improvement remains.

Load-bearing premise

The domain is sufficiently smooth and the reflected dynamics are well-posed so that the large deviation principle can be established under the zero-product boundary condition.

What would settle it

A concrete numerical run or analytic counterexample in which the large deviation rate function for the empirical measure of SRNLD fails to exceed that of RLD, despite A n = 0 holding on the boundary, would falsify the acceleration result.
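A rate function is hard to estimate directly from simulation, so a numerical check of this kind would more plausibly go through a proxy such as the asymptotic variance of an ergodic average. The sketch below shows a standard batch-means estimator for that quantity. It is a generic tool, not something the page attributes to the paper, and the function name and batch size are illustrative assumptions.

```python
import numpy as np

def batch_means_variance(values, batch_size):
    """Batch-means estimate of the asymptotic variance of the sample mean.

    Splits the series into non-overlapping batches, averages each batch,
    and returns batch_size times the sample variance of the batch means.
    """
    values = np.asarray(values, dtype=float)
    n_batches = len(values) // batch_size
    if n_batches < 2:
        raise ValueError("need at least two full batches")
    trimmed = values[: n_batches * batch_size]   # drop the incomplete tail
    means = trimmed.reshape(n_batches, batch_size).mean(axis=1)
    return batch_size * means.var(ddof=1)

# Usage on a synthetic i.i.d. series; a real check would pass a test function
# evaluated along two chains (with and without the skew term) and compare.
rng = np.random.default_rng(1)
series = rng.normal(size=100_000)
estimate = batch_means_variance(series, batch_size=1_000)
```

A single pair of estimates is noisy, so any comparison between two samplers should repeat the run over several seeds before drawing a conclusion.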

Original abstract

The problem of sampling a target probability distribution on a constrained domain arises in many applications including machine learning. For constrained sampling, various Langevin algorithms such as projected Langevin Monte Carlo (PLMC), based on the discretization of reflected Langevin dynamics (RLD) and more generally skew-reflected non-reversible Langevin Monte Carlo (SRNLMC), based on the discretization of skew-reflected non-reversible Langevin dynamics (SRNLD), have been proposed and studied in the literature. This work focuses on the long-time behavior of SRNLD, where a skew-symmetric matrix is added to RLD. Although acceleration for SRNLD has been studied, it is not clear how one should design the skew-symmetric matrix in the dynamics to achieve good performance in practice. We establish a large deviation principle (LDP) for the empirical measure of SRNLD when the skew-symmetric matrix is chosen such that its product with the outward unit normal vector field on the boundary is zero. By explicitly characterizing the rate functions, we show that this choice of the skew-symmetric matrix accelerates the convergence to the target distribution compared to RLD and reduces the asymptotic variance. Numerical experiments for SRNLMC based on the proposed skew-symmetric matrix show superior performance, which validate the theoretical findings from the large deviations theory.
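The rate-function comparison mentioned in the abstract can be written out using the paper passages quoted in the Lean theorems section of this page. The block below is a reading of those passages rather than a statement copied from the paper: the class of test functions u in the supremum and the exact form of the decomposition are assumptions, with the subscripts S and A denoting the symmetric and skew-symmetric parts.

```latex
\begin{align*}
  L_S u &= \Delta u - \langle \nabla f, \nabla u \rangle, \\
  L_A u &= -\langle J \nabla f, \nabla u \rangle, \\
  L_J   &= L_S + L_A, \\
  I_J(\nu) &= \sup_{u > 0} \left\{ -\int \frac{L_J u}{u} \, d\nu \right\}, \\
  I_J(\nu) &= I_S(\nu) + I_A(\nu), \qquad I_A(\nu) \ge 0 .
\end{align*}
```

Here I_S plays the role of the rate function for RLD (the case J = 0). Since I_A is nonnegative, the rate function with the skew term dominates the one without it, so the probability that the empirical measure stays near some ν other than the target decays at least as fast.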

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

The paper gives an explicit boundary condition on the skew matrix that strictly improves the large-deviation rate for skew-reflected non-reversible dynamics over plain reflected Langevin, backed by numerics, though the well-posedness of the SDE under that condition is the part that needs the most checking.

The key point is that when the skew-symmetric matrix A satisfies A n = 0 on the boundary, the large-deviation rate function for the empirical measure of SRNLD is strictly larger than for standard RLD. This implies faster convergence to the target and lower asymptotic variance, and the authors turn that into a concrete design rule for the skew term plus some supporting simulations on SRNLMC.

Referee Report

2 major / 1 minor

Summary. The manuscript establishes a large deviation principle (LDP) for the empirical measure of skew-reflected non-reversible Langevin dynamics (SRNLD) on a constrained domain when the skew-symmetric matrix A satisfies A n = 0 on the boundary. It explicitly characterizes the associated rate function, proves that this rate function is strictly larger than the one for standard reflected Langevin dynamics (RLD), and concludes that the choice accelerates convergence to the target measure while reducing asymptotic variance. The theoretical claims are illustrated by numerical experiments on the corresponding Monte Carlo discretization (SRNLMC).

Significance. If the LDP and the strict rate-function comparison hold, the work supplies a concrete design principle for skew matrices that improves long-time sampling performance on constrained domains, which is relevant to machine-learning tasks that require sampling from distributions supported on manifolds or with hard constraints. The explicit rate-function characterization via large-deviations theory offers a sharper diagnostic than standard ergodicity or spectral-gap arguments.

major comments (2)

[LDP theorem and its proof] The central LDP statement (abstract and the theorem establishing the LDP for SRNLD) rests on the well-posedness of the reflected SDE with the added skew drift under the boundary condition A n = 0. Standard reflected-Langevin theory does not automatically guarantee existence, uniqueness, and invariance of the target measure once a tangential component is introduced, because the local-time process at the boundary can be altered. No proof or reference is supplied for this step, which is load-bearing for every subsequent rate-function comparison.
[Rate-function comparison] The claim that the rate function of SRNLD is strictly larger than that of RLD (the comparison result following the LDP) is stated under the zero-product boundary condition, yet the manuscript does not delineate the precise regularity assumptions on the domain and on the target density that are needed for the strict inequality to hold. Without these, it is unclear whether the acceleration conclusion survives on domains with corners or for densities that vanish at the boundary.

minor comments (1)

[Numerical experiments] The numerical section would benefit from an explicit description of how the skew-symmetric matrix is constructed from the normal vector in each experiment and from a statement of the step-size schedule used for both SRNLMC and the baseline PLMC.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive report. The two major comments raise important points about well-posedness and the precise conditions for the rate-function comparison. We address each below and will incorporate the necessary clarifications and references into the revised manuscript.

Referee: The central LDP statement (abstract and the theorem establishing the LDP for SRNLD) rests on the well-posedness of the reflected SDE with the added skew drift under the boundary condition A n = 0. Standard reflected-Langevin theory does not automatically guarantee existence, uniqueness, and invariance of the target measure once a tangential component is introduced, because the local-time process at the boundary can be altered. No proof or reference is supplied for this step, which is load-bearing for every subsequent rate-function comparison.

Authors: We agree that a clear justification of well-posedness is essential. Under the boundary condition A n = 0 the additional skew drift lies entirely in the tangent plane, so it does not alter the normal reflection mechanism or the local-time process. Consequently, existence, uniqueness, and invariance of the target measure follow from standard results on reflected SDEs with tangential Lipschitz perturbations (e.g., the framework of Lions and Sznitman or more recent extensions to non-reversible drifts). We will add a short remark immediately after the SDE definition, citing the relevant theorem and briefly sketching why the tangential condition preserves the reflection law. revision: yes
Referee: The claim that the rate function of SRNLD is strictly larger than that of RLD (the comparison result following the LDP) is stated under the zero-product boundary condition, yet the manuscript does not delineate the precise regularity assumptions on the domain and on the target density that are needed for the strict inequality to hold. Without these, it is unclear whether the acceleration conclusion survives on domains with corners or for densities that vanish at the boundary.

Authors: The strict inequality between rate functions is derived under the standing assumptions that the domain is C^2-smooth (so that the boundary is locally a C^2 graph) and that the target density is positive and C^2 up to the boundary. The strict comparison in the variational formula for the rate function relies on these conditions. We acknowledge that the comparison may fail or require additional technical work on domains with corners (where the reflection law is more delicate) or when the density vanishes at the boundary (which could affect the support of the invariant measure). In the revision we will state these regularity assumptions explicitly in the theorem statements and add a short paragraph discussing the limitations for non-smooth domains and vanishing densities. revision: yes
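The tangency claim in the first response (that A n = 0 keeps the skew drift in the tangent space) follows from skew-symmetry alone. The short derivation below spells out that step; v stands for an arbitrary vector, for instance the gradient of the potential at a boundary point.

```latex
\begin{align*}
  \langle n, A v \rangle
    = \langle A^{\top} n, v \rangle
    = -\langle A n, v \rangle
    = 0
  \qquad \text{whenever } A^{\top} = -A \text{ and } A n = 0 .
\end{align*}
```

So the extra drift has no normal component at the boundary. This does not by itself establish well-posedness of the reflected SDE, which still depends on the regularity results the response cites.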

Circularity Check

0 steps flagged

LDP derivation for SRNLD under A n = 0 is self-contained; no reduction to fitted inputs or self-citations

The paper states it establishes the LDP for the empirical measure of SRNLD precisely when the skew-symmetric matrix satisfies A n = 0 on the boundary, then explicitly characterizes the rate functions to compare against RLD. This is presented as a direct theoretical result on long-time behavior rather than a post-hoc fit or renaming. No self-citation chains, ansatzes imported from prior author work, or self-definitional loops are visible in the provided abstract or derivation outline. The well-posedness premise is an external assumption for applying LDP theory, not a circular reduction within the paper's own equations.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on standard existence and regularity assumptions for reflected stochastic differential equations on a smooth bounded domain; no free parameters or new entities are introduced in the abstract.

axioms (1)

domain assumption: The domain is smooth and the reflected Langevin dynamics are well-posed.
Required for the large deviation principle to hold under the stated boundary condition on the skew matrix.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/AlexanderDuality.lean alexander_duality_circle_linking unclear

unclear
Relation between the paper passage and the cited Recognition theorem.

We choose J such that its product with the inward unit normal vector field on the boundary is zero (J(x)n(x) = 0 on ∂K). This simplifies the oblique boundary condition to the standard Neumann condition... L_S u(x) = Δu(x) − ⟨∇f(x), ∇u(x)⟩, L_A u(x) = −⟨J(x)∇f(x), ∇u(x)⟩.

IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

unclear
Relation between the paper passage and the cited Recognition theorem.

I(ν) = sup{−∫(L_J u/u) dν} with decomposition into symmetric and skew-symmetric parts; I_A(ν) ≥ 0 implies acceleration.

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.