Nonasymptotic Convergence Rates for Plug-and-Play Methods With MMSE Denoisers

Henry Pritchard; Rahul Parhi

arxiv: 2510.27211 · v7 · pith:BVACLJH6new · submitted 2025-10-31 · 🧮 math.OC · eess.SP · stat.ML


Pith reviewed 2026-05-18 03:34 UTC · model grok-4.3


keywords: MMSE denoiser · plug-and-play · weak convexity · Moreau envelope · convergence rates · proximal gradient descent · image deblurring · computed tomography


The pith

The MMSE denoiser under Gaussian noise corresponds to a 1-weakly convex regularizer given by an upper Moreau envelope of the negative log-marginal density, which yields the first sublinear convergence rates for plug-and-play proximal grad

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper establishes that the minimum-mean-squared-error denoiser for Gaussian noise acts as the proximal operator of a regularizer that equals an upper Moreau envelope of the negative log-marginal density of the clean signal. The envelope structure directly implies that the regularizer is 1-weakly convex. With this property in hand, the authors obtain an explicit sublinear convergence rate for plug-and-play proximal gradient descent. A sympathetic reader cares because earlier analyses of plug-and-play methods only established asymptotic convergence; non-asymptotic rates give concrete information about how many iterations are needed in practice for tasks such as image deblurring and computed tomography.

Core claim

The MMSE denoiser corresponds to a regularizer that can be written explicitly as an upper Moreau envelope of the negative log-marginal density, which in turn implies that the regularizer is 1-weakly convex. Using this property, the authors derive the first sublinear convergence guarantee for PnP proximal gradient descent with an MMSE denoiser.

What carries the argument

Upper Moreau envelope of the negative log-marginal density, which serves as the explicit regularizer induced by the MMSE denoiser and establishes its 1-weak convexity.
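For orientation, the objects named here can be written out as follows. This is a sketch under an assumed normalization (the parameter $\lambda$, the generic function $f$, and the placement of constants are illustrative choices; the paper's own definitions may differ by scaling or additive constants).

```latex
\begin{align*}
  % Lower (standard) Moreau envelope, the one associated with proximal operators
  M_{\lambda} f(x) &= \inf_{z} \Big\{ f(z) + \tfrac{1}{2\lambda}\,\|x - z\|^2 \Big\}, \\
  % Upper Moreau envelope: supremum, with the quadratic subtracted
  M^{\lambda} f(x) &= \sup_{z} \Big\{ f(z) - \tfrac{1}{2\lambda}\,\|x - z\|^2 \Big\}, \\
  % Weak convexity with modulus rho
  R \text{ is } \rho\text{-weakly convex}
    &\iff R + \tfrac{\rho}{2}\,\|\cdot\|^2 \text{ is convex}, \\
  % Expanding the square shows why a scaled upper envelope is 1-weakly convex
  \lambda\, M^{\lambda} f(x) + \tfrac{1}{2}\,\|x\|^2
    &= \sup_{z} \Big\{ \lambda f(z) - \tfrac{1}{2}\,\|z\|^2 + \langle x, z \rangle \Big\}.
\end{align*}
```

The right-hand side of the last line is a supremum of functions that are affine in $x$, so it is convex wherever it is finite. Under this normalization, $\lambda M^{\lambda} f$ is therefore 1-weakly convex regardless of the shape of $f$.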

If this is right

PnP proximal gradient descent with an MMSE denoiser converges at a sublinear rate.
The implicit regularizer induced by the MMSE denoiser can be recovered explicitly in one-dimensional synthetic studies.
Deblurring and computed-tomography experiments display the predicted sublinear convergence behavior.
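The iteration behind these claims can be sketched in one dimension, where the MMSE denoiser has a closed form. The snippet below is a minimal illustration, not the authors' implementation: the Gaussian-mixture prior, the scalar data term f(x) = (a x - b)^2 / 2, the step size, and all function names are assumptions made for the example, and the step-size conditions required by the paper's theorem are not reproduced here.

```python
import numpy as np

def mmse_denoiser_gmm(y, weights, means, stds, sigma=1.0):
    """Closed-form E[x | y] for x ~ sum_i w_i N(mu_i, s_i^2) and y = x + N(0, sigma^2)."""
    y = np.atleast_1d(np.asarray(y, dtype=float))[:, None]
    w = np.asarray(weights, dtype=float)[None, :]
    mu = np.asarray(means, dtype=float)[None, :]
    var = np.asarray(stds, dtype=float)[None, :] ** 2 + sigma ** 2  # marginal variances
    # Posterior component responsibilities, computed in the log domain for stability.
    log_r = np.log(w) - 0.5 * np.log(2 * np.pi * var) - 0.5 * (y - mu) ** 2 / var
    log_r -= log_r.max(axis=1, keepdims=True)
    r = np.exp(log_r)
    r /= r.sum(axis=1, keepdims=True)
    # Per-component posterior mean (Wiener-style shrinkage toward mu_i).
    gain = (var - sigma ** 2) / var
    return (r * (mu + gain * (y - mu))).sum(axis=1)

def pnp_pgd(x0, grad_f, denoiser, step, n_iter=50):
    """PnP proximal gradient descent: x <- denoiser(x - step * grad_f(x))."""
    x = np.atleast_1d(np.asarray(x0, dtype=float))
    residuals = []
    for _ in range(n_iter):
        x_next = denoiser(x - step * grad_f(x))
        residuals.append(float(np.linalg.norm(x_next - x)))  # fixed-point residual
        x = x_next
    return x, np.array(residuals)

# Hypothetical scalar problem: f(x) = 0.5 * (a * x - b)^2 with a single-Gaussian prior.
a, b, step = 1.0, 2.0, 0.5
grad_f = lambda x: a * (a * x - b)
denoise = lambda y: mmse_denoiser_gmm(y, [1.0], [0.0], [1.0], sigma=1.0)
x_hat, res = pnp_pgd([0.0], grad_f, denoise, step)
print(x_hat, res[-1])
```

With a standard normal prior and unit noise the denoiser reduces to y / 2, so each iteration is the map x -> 0.25 x + 0.5 and the fixed point is 2/3. Passing several components to `mmse_denoiser_gmm` gives a nonconvex induced regularizer while the loop stays unchanged.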

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same envelope representation might be used to analyze convergence of other plug-and-play algorithms such as ADMM if weak convexity can be established for those iterates.
Approximate learned denoisers that stay close to the true MMSE operator could inherit approximate versions of the same rate guarantees.
The explicit envelope form opens a route to designing new regularizers that mimic the MMSE structure in other inverse problems.

Load-bearing premise

The noise must be exactly Gaussian and the denoiser must be the precise MMSE operator without approximation or clipping.

What would settle it

A one-dimensional synthetic experiment that fails to recover a regularizer matching the upper Moreau envelope, or imaging experiments under exact Gaussian noise and exact MMSE denoising whose stationarity residual decays more slowly than the guaranteed sublinear rate, would falsify the claims.
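A one-dimensional check of this kind might look like the sketch below. It approximates the envelope by a maximum over a grid and compares it with a case where the answer is known in closed form. The normalization phi(x) = sup_y { h(y) - (x - y)^2 / 2 } with h = -sigma^2 log p_Y, the grids, the priors, and the function names are all assumptions for illustration; the paper's constants may differ.

```python
import numpy as np

def neg_log_marginal_gmm(y, weights, means, stds, sigma=1.0):
    """-log p_Y(y) for x ~ sum_i w_i N(mu_i, s_i^2) observed as y = x + N(0, sigma^2)."""
    y = np.asarray(y, dtype=float)[:, None]
    w = np.asarray(weights, dtype=float)[None, :]
    mu = np.asarray(means, dtype=float)[None, :]
    var = np.asarray(stds, dtype=float)[None, :] ** 2 + sigma ** 2
    log_terms = np.log(w) - 0.5 * np.log(2 * np.pi * var) - 0.5 * (y - mu) ** 2 / var
    m = log_terms.max(axis=1, keepdims=True)  # log-sum-exp for numerical stability
    return -(m[:, 0] + np.log(np.exp(log_terms - m).sum(axis=1)))

def upper_envelope_regularizer(x_grid, y_grid, h_vals):
    """Grid approximation of phi(x) = sup_y { h(y) - 0.5 * (x - y)^2 }."""
    diff = x_grid[:, None] - y_grid[None, :]
    return (h_vals[None, :] - 0.5 * diff ** 2).max(axis=1)

sigma = 1.0
y_grid = np.linspace(-20.0, 20.0, 4001)
x_grid = np.linspace(-3.0, 3.0, 121)

# Case with a known answer: prior N(0, 1), unit noise, MMSE denoiser y / 2,
# which is the proximal operator of 0.5 * x^2.
h_gauss = sigma ** 2 * neg_log_marginal_gmm(y_grid, [1.0], [0.0], [1.0], sigma)
phi_gauss = upper_envelope_regularizer(x_grid, y_grid, h_gauss)
phi_gauss_centered = phi_gauss - phi_gauss[len(x_grid) // 2]  # remove additive constant

# Bimodal prior: phi is nonconvex, but phi + 0.5 * x^2 should remain convex.
h_mix = sigma ** 2 * neg_log_marginal_gmm(y_grid, [0.5, 0.5], [-2.0, 2.0], [0.5, 0.5], sigma)
phi_mix = upper_envelope_regularizer(x_grid, y_grid, h_mix)
second_diff = np.diff(phi_mix + 0.5 * x_grid ** 2, n=2)

print(np.abs(phi_gauss_centered - 0.5 * x_grid ** 2).max(), second_diff.min())
```

In the Gaussian case the recovered curve should match 0.5 x^2 up to grid error, and in the bimodal case the second differences of phi + 0.5 x^2 should be nonnegative up to rounding, which is the discrete counterpart of 1-weak convexity.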

Figures

Figures reproduced from arXiv: 2510.27211 by Henry Pritchard, Rahul Parhi.

**Figure 1.** Calculated and learned regularizers for an MMSE denoiser under mixture-of-Gaussian priors with unit Gaussian noise. The calculated regularizer is…

**Figure 2.** Calculated and learned regularizers for an MMSE denoiser under mixture-of-Laplacian priors with unit Gaussian noise. The calculated regularizer is…

**Figure 3.** MMSE denoiser, convex potential, and implicit MMSE regularizer for a Laplacian prior.

**Figure 4.** PnP-PGD results on the MNIST dataset under Gaussian blur.

**Figure 5.** Stationary residual (41) over 50 iterations for PnP-PGD using an MMSE denoiser on the MNIST dataset under Gaussian blur.

**Figure 7.** Randomly selected 128 × 128 region of the Shepp–Logan phantom. Clean, noisy/blurred, and PnP-PGD reconstruction with an MMSE denoiser after 5 iterations. Gaussian noise; per-panel PSNR/SSIM are denoted.

**Figure 6.** Randomly selected 128 × 128 region of the Shepp–Logan phantom. Clean, noisy/blurred, and PnP-PGD reconstruction with an MMSE denoiser after 5 iterations. Gaussian noise; per-panel PSNR/SSIM are denoted. D. Architectural Details. We build upon the implementation of the Input Convex Neural Network (ICNN) in [22] to approximate the MMSE denoiser. The authors produced excellent results with softplus activation…

Original abstract

It is known that the minimum-mean-squared-error (MMSE) denoiser under Gaussian noise can be written as a proximal operator, which suffices for asymptotic convergence of plug-and-play (PnP) methods but does not reveal the structure of the induced regularizer or give convergence rates. We show that the MMSE denoiser corresponds to a regularizer that can be written explicitly as an upper Moreau envelope of the negative log-marginal density, which in turn implies that the regularizer is 1-weakly convex. Using this property, we derive (to the best of our knowledge) the first sublinear convergence guarantee for PnP proximal gradient descent with an MMSE denoiser. We validate the theory with a one-dimensional synthetic study that recovers the implicit regularizer. We also validate the theory with imaging experiments (deblurring and computed tomography), which exhibit the predicted sublinear behavior.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

This paper gives the first sublinear nonasymptotic convergence rate for PnP proximal gradient with exact MMSE denoisers by showing the induced regularizer is 1-weakly convex as an upper Moreau envelope.


The main thing to know is that this paper shows how to get the first sublinear convergence rate for plug-and-play proximal gradient descent when the denoiser is the exact MMSE estimator under Gaussian noise. They do this by identifying the implicit regularizer as the upper Moreau envelope of the negative log-marginal density, which turns out to be 1-weakly convex. That link lets them apply standard results from weakly convex optimization, which earlier PnP papers apparently did not have for non-asymptotic guarantees.

The one-dimensional synthetic experiment recovers the implicit regularizer as predicted, and the deblurring and CT experiments show the sublinear decay in practice. That is decent validation for a theory paper.

The soft spots sit mostly in the modeling assumptions. The whole argument needs exact Gaussian noise and a precise MMSE operator; any trained network that only approximates the MMSE breaks the weak-convexity and the rate directly. The rate constants may also depend on quantities that are not easy to check in real data, though the abstract does not detail the dependence. Still, under the stated ideal conditions the logic holds up without obvious gaps.

This is for researchers working on convergence analysis of PnP and related methods in imaging and inverse problems. Anyone who wants to move these algorithms from heuristics toward rates will find it useful. The math looks grounded enough that it should go to a serious referee rather than a desk reject. I would recommend sending it for peer review.

Referee Report

2 major / 3 minor

Summary. The manuscript claims that the MMSE denoiser for unit-variance Gaussian noise equals the proximal mapping of the upper Moreau envelope of the negative log-marginal density. This structure implies that the induced regularizer is 1-weakly convex. Consequently, standard results for proximal gradient methods on 1-weakly convex objectives yield sublinear convergence rates for the PnP proximal gradient descent iteration. The paper supports these claims with a one-dimensional synthetic experiment that recovers the implicit regularizer and with deblurring and computed tomography experiments that exhibit the predicted sublinear behavior.

Significance. If the central derivation holds, this paper makes a notable contribution by providing the first sublinear nonasymptotic convergence guarantee for plug-and-play proximal gradient descent using MMSE denoisers. Previous results were limited to asymptotic convergence based on the proximal representation alone. The explicit link to the upper Moreau envelope and 1-weak convexity is a clean and useful insight that could facilitate further analysis of PnP methods. The synthetic validation is particularly strong as it directly recovers the regularizer, while the imaging results provide empirical support for the theory under the stated assumptions.

major comments (2)

The proof that the regularizer is 1-weakly convex relies on showing that R(y) + (1/2)||y||² is the convex conjugate of g(x) - (1/2)||x||². This step is load-bearing for the sublinear rate; please ensure the conjugate relation is derived without additional assumptions on p_Y beyond those needed for the MMSE to exist.
The sublinear rate is for the stationarity measure; it would be helpful to specify the exact form of the rate (e.g., O(1/k)) and whether the hidden constants are independent of the problem dimensions or noise level, as this impacts the nonasymptotic nature of the result.

minor comments (3)

The phrase 'to the best of our knowledge' for the first sublinear guarantee is appropriate, but adding a sentence contrasting with existing asymptotic results would improve context.
In the imaging experiments, specify the exact implementation of the MMSE denoiser (e.g., closed-form or numerical) to allow reproducibility, especially since the theory requires the exact operator.
Define the upper Moreau envelope more clearly at first use, distinguishing it from the standard lower Moreau envelope used in proximal operators.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the positive assessment of our manuscript and for the constructive comments. We are glad that the link between MMSE denoisers and 1-weakly convex regularizers via upper Moreau envelopes is viewed as a clean insight enabling the first sublinear nonasymptotic rates for PnP proximal gradient descent. We address each major comment below.


Referee: The proof that the regularizer is 1-weakly convex relies on showing that R(y) + (1/2)||y||² is the convex conjugate of g(x) - (1/2)||x||². This step is load-bearing for the sublinear rate; please ensure the conjugate relation is derived without additional assumptions on p_Y beyond those needed for the MMSE to exist.

Authors: We confirm that the derivation relies solely on the minimal assumptions required for the MMSE denoiser to exist (i.e., that the marginal density p_Y yields a well-defined conditional expectation with finite second moment). The upper Moreau envelope of the negative log-marginal is introduced directly from the definition of the MMSE estimator, and the conjugate relation R(y) + (1/2)||y||² = sup_x { <x,y> - (g(x) - (1/2)||x||²) } follows from standard convex-analytic arguments without imposing any further regularity on p_Y. We will add an explicit remark after the statement of Theorem 3.2 clarifying these minimal assumptions in the revised manuscript.
Revision: yes
Referee: The sublinear rate is for the stationarity measure; it would be helpful to specify the exact form of the rate (e.g., O(1/k)) and whether the hidden constants are independent of the problem dimensions or noise level, as this impacts the nonasymptotic nature of the result.

Authors: We appreciate the request for greater precision. Theorem 4.1 establishes that the expected stationarity measure satisfies a bound of the form O(1/k) after k iterations of PnP proximal gradient descent. The hidden constants depend on the smoothness modulus of the data-fidelity term and the weak-convexity parameter (fixed at 1), but are independent of the noise level because the analysis is carried out for unit-variance Gaussian noise as stated in the problem setup. Dependence on dimension enters only through the Lipschitz constant of the gradient of the data term, which is problem-specific and may scale with dimension in high-dimensional imaging applications. We will explicitly state the O(1/k) rate and discuss the dependence of the constants in the revised version.
Revision: yes
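An O(1/k) claim of this kind can be checked empirically against a recorded residual sequence by fitting a slope on log-log axes. The helper below is a sketch, not the paper's procedure: the function name, the use of the running minimum, and the choice to square the residual are assumptions, since this page does not state which stationarity quantity Theorem 4.1 bounds.

```python
import numpy as np

def running_min_slope(residuals):
    """Least-squares slope of log(best-so-far squared residual) against log(k)."""
    r2 = np.asarray(residuals, dtype=float) ** 2
    best = np.minimum.accumulate(r2)       # min over iterations 1..k
    k = np.arange(1, len(best) + 1)
    mask = best > 0                        # the logarithm needs strictly positive values
    slope, _intercept = np.polyfit(np.log(k[mask]), np.log(best[mask]), 1)
    return slope

# Synthetic sequence whose squared residual decays like 1/k.
k = np.arange(1, 201)
synthetic = 1.0 / np.sqrt(k)
slope = running_min_slope(synthetic)
print(slope)
```

A fitted slope near -1 or steeper is consistent with an O(1/k) bound on the squared residual, while a clearly shallower slope over many iterations would be a warning sign. A fit of this kind cannot prove a rate and says nothing about the hidden constants.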

Circularity Check

0 steps flagged

No significant circularity; derivation is self-contained

full rationale

The paper begins from the externally known proximal-operator representation of the exact MMSE denoiser under Gaussian noise, then applies standard Moreau-envelope identities to exhibit an explicit regularizer R that is 1-weakly convex by construction of the convex conjugate. Standard sublinear stationarity rates for proximal-gradient methods on 1-weakly-convex objectives are then invoked. None of these steps reduces to a fitted parameter, a self-referential definition, or a load-bearing self-citation; the argument is a direct mathematical implication under the stated modeling assumptions and is therefore self-contained.
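The chain of implications described here can be sketched as follows. The notation is assumed for illustration ($D$ for the denoiser, $\psi$ for its convex potential, $h = -\sigma^2 \log p_Y$, $\phi$ for the induced regularizer), and the identities are stated up to additive constants and on the range of $D$, so the paper's own statement may be normalized differently.

```latex
\begin{align*}
  % Tweedie's formula: the MMSE denoiser is the gradient of a potential psi
  D(y) &= \mathbb{E}[x \mid y] = y + \sigma^2 \nabla \log p_Y(y) = \nabla \psi(y),
    \qquad \psi(y) = \tfrac{1}{2}\,\|y\|^2 - h(y), \\
  % If D is a proximal operator, its regularizer is tied to the conjugate of psi
  D = \operatorname{prox}_{\phi}
    \;&\Longrightarrow\;
    \phi(x) + \tfrac{1}{2}\,\|x\|^2 = \psi^{*}(x)
      = \sup_{y} \big\{ \langle x, y \rangle - \psi(y) \big\}, \\
  % Completing the square turns the conjugate into an upper Moreau envelope of h
  \phi(x) &= \sup_{y} \Big\{ h(y) - \tfrac{1}{2}\,\|x - y\|^2 \Big\}.
\end{align*}
```

Because $\psi^{*}$ is a convex conjugate, it is convex, so $\phi + \tfrac{1}{2}\|\cdot\|^2$ is convex and $\phi$ is 1-weakly convex. No fitted quantity enters at any step, which is the sense in which the argument is not circular.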

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The paper rests on the standard fact that the MMSE denoiser is a proximal operator for some regularizer, plus the definition of the upper Moreau envelope. No new free parameters or invented entities are introduced; the weak-convexity constant of 1 follows directly from envelope properties.

axioms (2)

domain assumption The MMSE denoiser under additive Gaussian noise is exactly the proximal operator of some regularizer.
Stated as known in the first sentence of the abstract; this is the starting point for the envelope construction.
standard math The negative log-marginal density is a proper lower-semicontinuous function so that the upper Moreau envelope is well-defined.
Implicit in the claim that the regularizer can be written as an upper Moreau envelope.

pith-pipeline@v0.9.0 · 5688 in / 1628 out tokens · 30537 ms · 2026-05-18T03:34:56.233495+00:00 · methodology


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear

Paper passage: We show that the MMSE denoiser corresponds to a regularizer that can be written explicitly as an upper Moreau envelope of the negative log-marginal density, which in turn implies that the regularizer is 1-weakly convex.

IndisputableMonolith/Foundation/AlphaCoordinateFixation.lean · costAlphaLog_fourth_deriv_at_zero · unclear

Paper passage: $\phi_{\mathrm{MMSE}}(x) = \sigma^2 M^{\sigma^2} f_Z(x) - \sigma^2 C_{x_0}$

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.