Proximal-IMH: Proximal Posterior Proposals for Independent Metropolis-Hastings with Approximate Operators
Pith reviewed 2026-05-21 11:32 UTC · model grok-4.3
The pith
Proximal correction of samples from a biased approximate posterior tightens the proposal and raises acceptance rates in independent Metropolis-Hastings sampling.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Proximal-IMH removes bias from samples of an approximate posterior by solving an auxiliary optimization problem. This yields a local adjustment that trades off adherence to the exact model against stability around the approximate reference point. For idealized settings, the proximal correction tightens the match between approximate and exact posteriors, thereby improving acceptance rates and mixing. The method applies to both linear and nonlinear input-output operators and is particularly suitable for inverse problems where exact posterior sampling is too expensive.
What carries the argument
The proximal correction, an auxiliary optimization problem that performs a local adjustment on each sample drawn from the approximate posterior to improve alignment with the exact posterior.
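For a linear operator the correction has a closed form, which makes the "local adjustment" concrete. The following is a minimal sketch, assuming the objective quoted on this page as Eq. 3b with linear exact and approximate operators; the function name `proximal_correct_linear`, the matrix sizes, and the value of `beta` are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def proximal_correct_linear(A, A_tilde, x_tilde, beta):
    """Minimize ||A x - A_tilde x_tilde||^2 + beta * ||x - x_tilde||^2 over x.

    For linear operators the minimizer solves the normal equations
    (A^T A + beta I) x = A^T A_tilde x_tilde + beta x_tilde.
    """
    n = A.shape[1]
    lhs = A.T @ A + beta * np.eye(n)
    rhs = A.T @ (A_tilde @ x_tilde) + beta * x_tilde
    return np.linalg.solve(lhs, rhs)

# Hypothetical test problem: a random exact operator and a perturbed cheap one.
rng = np.random.default_rng(0)
A = rng.standard_normal((8, 5))
A_tilde = A + 0.1 * rng.standard_normal((8, 5))
x_tilde = rng.standard_normal(5)   # stands in for a sample from the approximate posterior
beta = 0.5

x = proximal_correct_linear(A, A_tilde, x_tilde, beta)

def objective(z):
    return np.sum((A @ z - A_tilde @ x_tilde) ** 2) + beta * np.sum((z - x_tilde) ** 2)

print("objective at x_tilde:", objective(x_tilde))
print("objective at corrected x:", objective(x))
```

A large `beta` keeps the corrected sample close to `x_tilde`; a small `beta` prioritizes agreement with the exact operator. For nonlinear operators no closed form is available and an iterative solver would be needed.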
Load-bearing premise
The method assumes that an approximate posterior exists that is cheaper to sample from, though it may be significantly biased, and that the auxiliary proximal optimization problem can be solved reliably for the given operators.
What would settle it
Run Proximal-IMH on a simple linear inverse problem with known exact posterior and compare acceptance rates and effective sample size against standard independent Metropolis-Hastings without the proximal step; no improvement would contradict the claimed tightening effect.
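One way such a check could be set up is sketched below for a linear-Gaussian problem, where the exact posterior, the approximate posterior, and the pushforward of the approximate posterior under the linear proximal map are all Gaussian. Everything here is an assumption for illustration: the standard-normal prior, the noise level, the perturbation that defines `A_tilde`, and the value of `beta`. The sketch only reports acceptance rates (effective sample size is omitted) and makes no claim about which proposal wins.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m, sigma, beta = 4, 6, 0.5, 0.1               # illustrative values, not from the paper
A = rng.standard_normal((m, n))                   # stand-in for the exact operator
A_tilde = A + 0.2 * rng.standard_normal((m, n))   # stand-in for the cheap, biased operator
y = A @ rng.standard_normal(n) + sigma * rng.standard_normal(m)

def gaussian_posterior(op):
    """Posterior mean and covariance for prior N(0, I) and noise N(0, sigma^2 I)."""
    cov = np.linalg.inv(op.T @ op / sigma**2 + np.eye(n))
    return cov @ op.T @ y / sigma**2, cov

m_exact, C_exact = gaussian_posterior(A)
m_approx, C_approx = gaussian_posterior(A_tilde)

# Linear proximal map x = K x_tilde, with K as quoted on this page (Eq. 4).
K = np.linalg.solve(A.T @ A + beta * np.eye(n), A.T @ A_tilde + beta * np.eye(n))
m_prox, C_prox = K @ m_approx, K @ C_approx @ K.T  # pushforward of a Gaussian

def log_gauss(x, mean, prec):
    """Gaussian log-density up to an additive constant."""
    d = x - mean
    return -0.5 * d @ prec @ d

def imh_acceptance_rate(q_mean, q_cov, n_steps=5000):
    """IMH targeting the exact posterior with independent proposals N(q_mean, q_cov)."""
    q_cov = 0.5 * (q_cov + q_cov.T)                # symmetrize against round-off
    q_prec, chol = np.linalg.inv(q_cov), np.linalg.cholesky(q_cov)
    prec_exact = np.linalg.inv(C_exact)

    def log_w(x):                                  # log importance weight: target / proposal
        return log_gauss(x, m_exact, prec_exact) - log_gauss(x, q_mean, q_prec)

    # Only the current state's log-weight is needed to count acceptances.
    lw, accepted = log_w(q_mean), 0
    for _ in range(n_steps):
        prop = q_mean + chol @ rng.standard_normal(n)
        lw_prop = log_w(prop)
        if np.log(rng.uniform()) < lw_prop - lw:   # accept with probability min(1, w'/w)
            lw, accepted = lw_prop, accepted + 1
    return accepted / n_steps

rate_plain = imh_acceptance_rate(m_approx, C_approx)
rate_prox = imh_acceptance_rate(m_prox, C_prox)
print(f"plain IMH: {rate_plain:.3f}, proximal-corrected proposal: {rate_prox:.3f}")
```

Because each chain uses a fixed proposal, normalizing constants cancel in the weight ratio, so unnormalized log-densities suffice. Sweeping `beta` and the size of the perturbation in `A_tilde` would show where, if anywhere, the corrected proposal stops helping.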
Original abstract
We consider the problem of sampling from a posterior distribution arising in Bayesian inverse problems in science, engineering, and imaging. Our method belongs to the family of independence Metropolis-Hastings (IMH) sampling algorithms, which are common in Bayesian inference. Relying on the existence of an approximate posterior distribution that is cheaper to sample from but may have significant bias, we introduce Proximal-IMH, a scheme that removes this bias by correcting samples from the approximate posterior through an auxiliary optimization problem. This yields a local adjustment that trades off adherence to the exact model against stability around the approximate reference point. For idealized settings, we prove that the proximal correction tightens the match between approximate and exact posteriors, thereby improving acceptance rates and mixing. The method applies to both linear and nonlinear input-output operators and is particularly suitable for inverse problems where exact posterior sampling is too expensive. We present numerical experiments including multimodal and data-driven priors with nonlinear input-output operators. The results show that Proximal-IMH reliably outperforms existing IMH variants.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces Proximal-IMH, an independence Metropolis-Hastings sampler for Bayesian inverse problems. It draws proposals from a cheap but biased approximate posterior and applies a proximal correction via an auxiliary optimization problem to produce a local adjustment that trades off fidelity to the exact posterior against stability around the approximate reference. For idealized settings the authors prove that this correction tightens the match to the exact posterior and thereby improves acceptance rates and mixing; the method is claimed to apply to both linear and nonlinear forward operators. Numerical experiments on multimodal and data-driven priors with nonlinear operators are reported to show reliable outperformance over existing IMH variants.
Significance. If the idealized proof holds and the numerical gains are reproducible, the approach could supply a practical bias-correction mechanism for IMH when only approximate posteriors are cheaply available. The explicit proof for idealized cases together with the extension to nonlinear operators constitutes a clear strength; the method also avoids introducing new free parameters beyond the proximal regularization weight.
major comments (2)
- [idealized-settings proof] The proof that the proximal correction tightens the match to the exact posterior (abstract and the idealized-settings analysis) is derived under the assumption of an exact solve of the auxiliary proximal optimization problem. For nonlinear input-output operators this problem is non-convex; any practical solver tolerance or early stopping therefore risks violating the exactness assumption used to derive the acceptance-rate improvement. Please state the required accuracy of the proximal solve explicitly and analyze the effect of approximate solves on the acceptance probability.
- [numerical experiments] The central claim of improved mixing and acceptance rates rests on the proximal correction reducing bias relative to the approximate posterior. The manuscript should clarify whether the reported numerical gains remain when the proximal subproblem is solved only to moderate accuracy (e.g., 10^{-3} relative tolerance), as this directly tests the robustness of the theoretical guarantee for the nonlinear cases shown in the experiments.
minor comments (2)
- [method description] The role of the proximal regularization parameter is introduced but its selection strategy across the reported experiments is not detailed; a brief discussion or default rule would improve reproducibility.
- [algorithm] Notation for the approximate posterior and the proximal operator should be made consistent between the theoretical statements and the algorithmic pseudocode.
Simulated Author's Rebuttal
We thank the referee for the careful reading and constructive feedback on our manuscript. We address the major comments point by point below and will revise the manuscript to incorporate the suggested clarifications and additional analysis.
Point-by-point responses
- Referee: [idealized-settings proof] The proof that the proximal correction tightens the match to the exact posterior (abstract and the idealized-settings analysis) is derived under the assumption of an exact solve of the auxiliary proximal optimization problem. For nonlinear input-output operators this problem is non-convex; any practical solver tolerance or early stopping therefore risks violating the exactness assumption used to derive the acceptance-rate improvement. Please state the required accuracy of the proximal solve explicitly and analyze the effect of approximate solves on the acceptance probability.
  Authors: We agree that the idealized proof relies on an exact solution of the proximal optimization problem. In the revised manuscript we will explicitly state the solver accuracy (in terms of relative residual tolerance) required to preserve the theoretical guarantees on acceptance-rate improvement. We will also add a short perturbation analysis showing that, under a bounded error in the proximal solution, the acceptance probability remains strictly higher than that of the uncorrected approximate posterior, with the improvement degrading gracefully as a function of the tolerance. Remarks on the implications for non-convex nonlinear operators will be included in the idealized-settings section.
  Revision: yes
- Referee: [numerical experiments] The central claim of improved mixing and acceptance rates rests on the proximal correction reducing bias relative to the approximate posterior. The manuscript should clarify whether the reported numerical gains remain when the proximal subproblem is solved only to moderate accuracy (e.g., 10^{-3} relative tolerance), as this directly tests the robustness of the theoretical guarantee for the nonlinear cases shown in the experiments.
  Authors: We thank the referee for this important robustness check. In the revised numerical experiments section we will report results for the nonlinear-operator examples in which the proximal subproblem is solved only to a moderate accuracy of 10^{-3} relative tolerance. These additional runs confirm that the gains in acceptance rates and mixing are largely retained, albeit with a modest reduction relative to high-accuracy solves, thereby supporting the practical applicability of the method.
  Revision: yes
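The effect of a moderate solver tolerance can be illustrated on the linear version of the proximal subproblem, where an exact solution is available for comparison. This is a sketch under stated assumptions: linear operators, a hand-written conjugate-gradient routine (`cg_solve`, a hypothetical helper), and arbitrary problem sizes. It is not the authors' experiment and says nothing about the non-convex nonlinear case raised by the referee.

```python
import numpy as np

def cg_solve(H, b, rel_tol, max_iter=1000):
    """Conjugate gradients for H x = b with H symmetric positive definite.

    Stops once the residual norm falls to rel_tol * ||b||.
    """
    x = np.zeros_like(b)
    r = b - H @ x
    p = r.copy()
    rs = r @ r
    b_norm = np.linalg.norm(b)
    for _ in range(max_iter):
        if np.sqrt(rs) <= rel_tol * b_norm:
            break
        Hp = H @ p
        alpha = rs / (p @ Hp)
        x = x + alpha * p
        r = r - alpha * Hp
        rs_new = r @ r
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

# Hypothetical linear proximal subproblem (sizes and beta are illustrative).
rng = np.random.default_rng(2)
A = rng.standard_normal((30, 20))
A_tilde = A + 0.1 * rng.standard_normal((30, 20))
x_tilde = rng.standard_normal(20)
beta = 0.5

# Normal equations of min_x ||A x - A_tilde x_tilde||^2 + beta ||x - x_tilde||^2.
H = A.T @ A + beta * np.eye(20)
b = A.T @ (A_tilde @ x_tilde) + beta * x_tilde

x_exact = np.linalg.solve(H, b)          # reference "exact" proximal solution
x_loose = cg_solve(H, b, rel_tol=1e-3)   # moderate-accuracy solve

rel_residual = np.linalg.norm(b - H @ x_loose) / np.linalg.norm(b)
rel_error = np.linalg.norm(x_loose - x_exact) / np.linalg.norm(x_exact)
print(f"relative residual: {rel_residual:.2e}, relative error in x: {rel_error:.2e}")
```

The relative error in the corrected sample is bounded by the condition number of `H` times the relative residual, so the same tolerance can mean very different sample accuracy depending on `beta` and the conditioning of the operator. Feeding `x_loose` rather than `x_exact` into an acceptance-rate comparison would be the natural next step.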
Circularity Check
Derivation self-contained via new proximal correction and idealized proof
The paper defines Proximal-IMH by introducing an auxiliary proximal optimization to correct samples from an approximate posterior, then proves in idealized settings that this correction tightens the match to the exact posterior (improving IMH acceptance). This proof and the supporting numerical experiments on linear/nonlinear operators constitute independent content; no parameter is fitted to data and then relabeled as a prediction, no self-citation chain bears the central claim, and no equation reduces to its own input by construction. The idealized proof's exact-solve assumption is a modeling choice, not a definitional loop.
Axiom & Free-Parameter Ledger
free parameters (1)
- proximal regularization parameter
axioms (1)
- domain assumption: An approximate posterior that is cheaper to sample from exists and can be used to generate proposals.
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean, theorem washburn_uniqueness_aczel
  Relation between the paper passage and the cited Recognition theorem: unclear
  Paper passage: x = arg min_x ∥A(x) − Ã(x̃)∥² + β∥x − x̃∥² (Eq. 3b); K = (AᵀA + βI)⁻¹(AᵀÃ + βI) (Eq. 4)
- IndisputableMonolith/Foundation/AlexanderDuality.lean, theorem alexander_duality_circle_linking
  Relation between the paper passage and the cited Recognition theorem: unclear
  Paper passage: mixing-time bounds via local Lipschitz constants of log-weights (Thm 3.3, A.3)
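The two formulas quoted above as Eq. 3b and Eq. 4 are consistent with each other when the operators are linear. The short derivation below assumes linear A and Ã, β > 0 (so that AᵀA + βI is invertible), and reads the tilde-marked symbols as the approximate operator and the approximate-posterior sample; that reading is an interpretation of the quoted passage, not something this page confirms.

```latex
\begin{align*}
f(x) &= \|Ax - \tilde{A}\tilde{x}\|^2 + \beta \|x - \tilde{x}\|^2, \\
\nabla f(x) &= 2A^\top (Ax - \tilde{A}\tilde{x}) + 2\beta (x - \tilde{x}) = 0, \\
(A^\top A + \beta I)\, x &= (A^\top \tilde{A} + \beta I)\, \tilde{x}, \\
x &= K \tilde{x}, \qquad K = (A^\top A + \beta I)^{-1} (A^\top \tilde{A} + \beta I).
\end{align*}
```

When Ã = A the map reduces to K = I, so an unbiased approximate operator leaves samples unchanged.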
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.