Nonasymptotic Convergence Rates for Plug-and-Play Methods With MMSE Denoisers
Pith reviewed 2026-05-18 03:34 UTC · model grok-4.3
The pith
The MMSE denoiser under Gaussian noise corresponds to a 1-weakly convex regularizer given by an upper Moreau envelope of the negative log-marginal density, which yields the first sublinear convergence rates for plug-and-play proximal grad
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The MMSE denoiser corresponds to a regularizer that can be written explicitly as an upper Moreau envelope of the negative log-marginal density, which in turn implies that the regularizer is 1-weakly convex. Using this property, the authors derive the first sublinear convergence guarantee for PnP proximal gradient descent with an MMSE denoiser.
What carries the argument
Upper Moreau envelope of the negative log-marginal density, which serves as the explicit regularizer induced by the MMSE denoiser and establishes its 1-weak convexity.
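As a small illustration of how the marginal density enters, the sketch below evaluates the MMSE denoiser in one dimension through Tweedie's identity, D(y) = y + σ² (d/dy) log p_Y(y), and compares it with the posterior mean computed directly. The Gaussian-mixture prior, its parameters, and all function names are assumptions made for this example; none of them come from the paper.

```python
import numpy as np

def gauss_pdf(y, mean, var):
    """Density of N(mean, var) evaluated at y (broadcasts over arrays)."""
    return np.exp(-0.5 * (y - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)

def marginal_terms(y, weights, means, variances, sigma2):
    """Per-component terms w_i * N(y; m_i, s_i^2 + sigma2) of the marginal p_Y."""
    y = np.asarray(y, dtype=float)[:, None]
    return weights * gauss_pdf(y, means, variances + sigma2)

def mmse_tweedie(y, weights, means, variances, sigma2):
    """MMSE denoiser via Tweedie: y + sigma2 * d/dy log p_Y(y)."""
    y = np.asarray(y, dtype=float)
    terms = marginal_terms(y, weights, means, variances, sigma2)
    p = terms.sum(axis=1)
    # Derivative of each Gaussian term with respect to y.
    dp = (-(y[:, None] - means) / (variances + sigma2) * terms).sum(axis=1)
    return y + sigma2 * dp / p

def mmse_posterior_mean(y, weights, means, variances, sigma2):
    """MMSE denoiser as the responsibility-weighted mix of per-component posterior means."""
    y = np.asarray(y, dtype=float)
    terms = marginal_terms(y, weights, means, variances, sigma2)
    resp = terms / terms.sum(axis=1, keepdims=True)
    gain = variances / (variances + sigma2)
    return (resp * (means + gain * (y[:, None] - means))).sum(axis=1)

# Hypothetical two-component prior and unit-variance noise.
weights = np.array([0.3, 0.7])
means = np.array([-2.0, 1.5])
variances = np.array([0.25, 1.0])
sigma2 = 1.0

y_grid = np.linspace(-6.0, 6.0, 241)
d_tweedie = mmse_tweedie(y_grid, weights, means, variances, sigma2)
d_direct = mmse_posterior_mean(y_grid, weights, means, variances, sigma2)
max_gap = float(np.max(np.abs(d_tweedie - d_direct)))
print("largest gap between the two evaluations:", max_gap)
```

The two evaluations should agree to floating-point precision, and the denoiser is increasing in y, which is consistent with it being a proximal operator in one dimension.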
If this is right
- PnP proximal gradient descent with an MMSE denoiser converges at a sublinear rate.
- The implicit regularizer induced by the MMSE denoiser can be recovered explicitly in one-dimensional synthetic studies.
- Deblurring and computed-tomography experiments display the predicted sublinear convergence behavior.
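The iteration behind the first bullet can be sketched as follows: each step takes a gradient step on the data-fidelity term and then applies the denoiser in place of a proximal operator. This is a minimal sketch under loud assumptions: the least-squares data term, the Gaussian prior (chosen only because its MMSE denoiser under unit-variance noise is the closed-form shrinkage `prior_var / (prior_var + 1)`), and every name below are hypothetical and are not taken from the paper.

```python
import numpy as np

def pnp_pgd(grad_f, denoiser, x0, step, num_iters):
    """Run x_{k+1} = denoiser(x_k - step * grad_f(x_k)) and log fixed-point residuals."""
    x = np.array(x0, dtype=float)
    residuals = []
    for _ in range(num_iters):
        x_next = denoiser(x - step * grad_f(x))
        residuals.append(float(np.linalg.norm(x_next - x)))
        x = x_next
    return x, np.array(residuals)

# Hypothetical toy problem: least squares with a Gaussian prior x ~ N(0, prior_var * I).
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 10))
b = rng.standard_normal(20)
prior_var = 4.0
shrink = prior_var / (prior_var + 1.0)  # exact MMSE denoiser for unit-variance noise

def grad_f(x):
    """Gradient of the data term f(x) = 0.5 * ||A x - b||^2."""
    return A.T @ (A @ x - b)

def denoiser(z):
    """Closed-form MMSE denoiser for the assumed Gaussian prior."""
    return shrink * z

step = 1.0 / np.linalg.norm(A, 2) ** 2  # 1 / L, with L the largest eigenvalue of A^T A

x_hat, residuals = pnp_pgd(grad_f, denoiser, np.zeros(10), step, 300)

# The fixed point solves x / prior_var + step * A^T (A x - b) = 0.
x_star = np.linalg.solve(np.eye(10) / prior_var + step * A.T @ A, step * A.T @ b)
gap = float(np.linalg.norm(x_hat - x_star))
print("distance to closed-form fixed point:", gap)
```

Because both the denoiser and the gradient are linear here, this toy iteration is a contraction and converges linearly. It illustrates the structure of the update only, not the sublinear regime that the paper analyzes for general priors.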
Where Pith is reading between the lines
- The same envelope representation might be used to analyze convergence of other plug-and-play algorithms such as ADMM if weak convexity can be established for those iterates.
- Approximate learned denoisers that stay close to the true MMSE operator could inherit approximate versions of the same rate guarantees.
- The explicit envelope form opens a route to designing new regularizers that mimic the MMSE structure in other inverse problems.
Load-bearing premise
The noise must be exactly Gaussian and the denoiser must be the precise MMSE operator without approximation or clipping.
What would settle it
A one-dimensional synthetic experiment that fails to recover a regularizer matching the upper Moreau envelope, or imaging experiments that converge more slowly than the predicted sublinear rate under exact Gaussian noise and exact MMSE denoising, would falsify the claims.
Original abstract
It is known that the minimum-mean-squared-error (MMSE) denoiser under Gaussian noise can be written as a proximal operator, which suffices for asymptotic convergence of plug-and-play (PnP) methods but does not reveal the structure of the induced regularizer or give convergence rates. We show that the MMSE denoiser corresponds to a regularizer that can be written explicitly as an upper Moreau envelope of the negative log-marginal density, which in turn implies that the regularizer is 1-weakly convex. Using this property, we derive (to the best of our knowledge) the first sublinear convergence guarantee for PnP proximal gradient descent with an MMSE denoiser. We validate the theory with a one-dimensional synthetic study that recovers the implicit regularizer. We also validate the theory with imaging experiments (deblurring and computed tomography), which exhibit the predicted sublinear behavior.
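One way to picture the one-dimensional recovery of the implicit regularizer is through the first-order optimality condition of a proximal operator: if x = prox_φ(y) and φ is differentiable, then φ'(x) = y − x. The sketch below integrates that slope numerically along the denoiser's output. It is not the paper's procedure; it assumes a strictly increasing denoiser, unit-variance noise, and a Gaussian prior for which the answer is known in closed form (φ(x) = x² / (2 · prior_var)). The function and variable names are hypothetical.

```python
import numpy as np

def recover_regularizer(y_grid, denoised):
    """Recover phi (up to a constant) from x = prox_phi(y) using phi'(x) = y - x."""
    x = np.asarray(denoised, dtype=float)
    if not np.all(np.diff(x) > 0):
        raise ValueError("denoiser output must be strictly increasing on the grid")
    slope = np.asarray(y_grid, dtype=float) - x
    # Cumulative trapezoidal rule for the integral of phi'(x) dx along the output grid.
    increments = 0.5 * (slope[1:] + slope[:-1]) * np.diff(x)
    phi = np.concatenate(([0.0], np.cumsum(increments)))
    return x, phi - phi.min()  # fix the free constant so that min(phi) = 0

# Hypothetical check: Gaussian prior N(0, prior_var), unit-variance noise.
prior_var = 4.0
y_grid = np.linspace(-5.0, 5.0, 2001)
denoised = prior_var / (prior_var + 1.0) * y_grid  # closed-form MMSE denoiser

x, phi = recover_regularizer(y_grid, denoised)
max_err = float(np.max(np.abs(phi - x ** 2 / (2.0 * prior_var))))
print("max deviation from x^2 / (2 * prior_var):", max_err)
```

For a non-Gaussian prior the same routine can be fed any numerically evaluated MMSE denoiser, as long as its output is strictly increasing on the grid.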
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript claims that the MMSE denoiser for unit-variance Gaussian noise equals the proximal mapping of the upper Moreau envelope of the negative log-marginal density. This structure implies that the induced regularizer is 1-weakly convex. Consequently, standard results for proximal gradient methods on 1-weakly convex objectives yield sublinear convergence rates for the PnP proximal gradient descent iteration. The paper supports these claims with a one-dimensional synthetic experiment that recovers the implicit regularizer and with deblurring and computed tomography experiments that exhibit the predicted sublinear behavior.
Significance. If the central derivation holds, this paper makes a notable contribution by providing the first sublinear nonasymptotic convergence guarantee for plug-and-play proximal gradient descent using MMSE denoisers. Previous results were limited to asymptotic convergence based on the proximal representation alone. The explicit link to the upper Moreau envelope and 1-weak convexity is a clean and useful insight that could facilitate further analysis of PnP methods. The synthetic validation is particularly strong as it directly recovers the regularizer, while the imaging results provide empirical support for the theory under the stated assumptions.
major comments (2)
- The proof that the regularizer is 1-weakly convex relies on showing that R(y) + (1/2)||y||² is the convex conjugate of g(x) - (1/2)||x||². This step is load-bearing for the sublinear rate; please ensure the conjugate relation is derived without additional assumptions on p_Y beyond those needed for the MMSE to exist.
- The sublinear rate is for the stationarity measure; it would be helpful to specify the exact form of the rate (e.g., O(1/k)) and whether the hidden constants are independent of the problem dimensions or noise level, as this impacts the nonasymptotic nature of the result.
minor comments (3)
- The phrase 'to the best of our knowledge' for the first sublinear guarantee is appropriate, but adding a sentence contrasting with existing asymptotic results would improve context.
- In the imaging experiments, specify the exact implementation of the MMSE denoiser (e.g., closed-form or numerical) to allow reproducibility, especially since the theory requires the exact operator.
- Define the upper Moreau envelope more clearly at first use, distinguishing it from the standard lower Moreau envelope used in proximal operators.
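For reference on the last minor comment, one common convention for the two envelopes of a function h with parameter λ > 0 is written out below. The symbols h, λ, and the sub/superscript notation are assumptions for this sketch; the paper's own definition may differ in sign or scaling.

```latex
% Lower (standard) Moreau envelope: an infimal convolution with a quadratic.
M_{\lambda} h(y) \;=\; \inf_{x} \Big\{ h(x) + \tfrac{1}{2\lambda}\,\|x - y\|^{2} \Big\}

% Upper Moreau envelope: a supremum with the quadratic subtracted.
M^{\lambda} h(y) \;=\; \sup_{x} \Big\{ h(x) - \tfrac{1}{2\lambda}\,\|x - y\|^{2} \Big\}
```

The lower envelope is the one whose minimizer defines the proximal operator, whereas the upper envelope is the object the paper uses to express the regularizer.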
Simulated Author's Rebuttal
We thank the referee for the positive assessment of our manuscript and for the constructive comments. We are glad that the link between MMSE denoisers and 1-weakly convex regularizers via upper Moreau envelopes is viewed as a clean insight enabling the first sublinear nonasymptotic rates for PnP proximal gradient descent. We address each major comment below.
Point-by-point responses
- Referee: The proof that the regularizer is 1-weakly convex relies on showing that R(y) + (1/2)||y||² is the convex conjugate of g(x) - (1/2)||x||². This step is load-bearing for the sublinear rate; please ensure the conjugate relation is derived without additional assumptions on p_Y beyond those needed for the MMSE to exist.
  Authors: We confirm that the derivation relies solely on the minimal assumptions required for the MMSE denoiser to exist (i.e., that the marginal density p_Y yields a well-defined conditional expectation with finite second moment). The upper Moreau envelope of the negative log-marginal is introduced directly from the definition of the MMSE estimator, and the conjugate relation R(y) + (1/2)||y||² = sup_x { <x,y> - (g(x) - (1/2)||x||²) } follows from standard convex-analytic arguments without imposing any further regularity on p_Y. We will add an explicit remark after the statement of Theorem 3.2 clarifying these minimal assumptions in the revised manuscript.
  Revision: yes
- Referee: The sublinear rate is for the stationarity measure; it would be helpful to specify the exact form of the rate (e.g., O(1/k)) and whether the hidden constants are independent of the problem dimensions or noise level, as this impacts the nonasymptotic nature of the result.
  Authors: We appreciate the request for greater precision. Theorem 4.1 establishes that the expected stationarity measure satisfies a bound of the form O(1/k) after k iterations of PnP proximal gradient descent. The hidden constants depend on the smoothness modulus of the data-fidelity term and the weak-convexity parameter (fixed at 1), but are independent of the noise level because the analysis is carried out for unit-variance Gaussian noise as stated in the problem setup. Dependence on dimension enters only through the Lipschitz constant of the gradient of the data term, which is problem-specific and may scale with dimension in high-dimensional imaging applications. We will explicitly state the O(1/k) rate and discuss the dependence of the constants in the revised version.
  Revision: yes
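The mechanism behind the first response, namely why an upper Moreau envelope with unit parameter is 1-weakly convex, can be sketched for a generic function h. The symbols U and h are placeholders for this illustration; how they map onto the paper's R and g depends on sign conventions that this page does not state.

```latex
% Upper Moreau envelope of h with unit parameter.
U(y) \;=\; \sup_{x} \Big\{ h(x) - \tfrac{1}{2}\,\|x - y\|^{2} \Big\}

% Expand the square and move the term that depends only on y to the left-hand side.
U(y) + \tfrac{1}{2}\,\|y\|^{2}
  \;=\; \sup_{x} \Big\{ \langle x, y \rangle - \big( \tfrac{1}{2}\,\|x\|^{2} - h(x) \big) \Big\}
  \;=\; \big( \tfrac{1}{2}\,\|\cdot\|^{2} - h \big)^{*}(y)

% The right-hand side is a pointwise supremum of functions that are affine in y,
% hence convex. So U + (1/2)||.||^2 is convex, i.e. U is 1-weakly convex.
```

No regularity of h is needed for the convexity step itself; conditions on h only matter for the supremum to be finite.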
Circularity Check
No significant circularity; derivation is self-contained
The paper begins from the externally known proximal-operator representation of the exact MMSE denoiser under Gaussian noise, then applies standard Moreau-envelope identities to exhibit an explicit regularizer R that is 1-weakly convex by construction of the convex conjugate. Standard sublinear stationarity rates for proximal-gradient methods on 1-weakly-convex objectives are then invoked. None of these steps reduces to a fitted parameter, a self-referential definition, or a load-bearing self-citation; the argument is a direct mathematical implication under the stated modeling assumptions and is therefore self-contained.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption: The MMSE denoiser under additive Gaussian noise is exactly the proximal operator of some regularizer.
- standard math: The negative log-marginal density is a proper lower-semicontinuous function so that the upper Moreau envelope is well-defined.
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
unclear: relation between the paper passage and the cited Recognition theorem.
We show that the MMSE denoiser corresponds to a regularizer that can be written explicitly as an upper Moreau envelope of the negative log-marginal density, which in turn implies that the regularizer is 1-weakly convex.
- IndisputableMonolith/Foundation/AlphaCoordinateFixation.lean · costAlphaLog_fourth_deriv_at_zero · unclear
unclear: relation between the paper passage and the cited Recognition theorem.
ϕ_MMSE(x) = σ² M^{σ²} f_Z(x) − σ² C_{x_0}
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.