Nonasymptotic Convergence Rates for Plug-and-Play Methods With MMSE Denoisers

Henry Pritchard; Rahul Parhi

arxiv: 2510.27211 · v6 · submitted 2025-10-31 · 🧮 math.OC · eess.SP · stat.ML

Pith reviewed 2026-05-18 03:34 UTC · model grok-4.3

keywords: MMSE denoiser · plug-and-play · weak convexity · Moreau envelope · convergence rates · proximal gradient descent · image deblurring · computed tomography

The pith

The MMSE denoiser under Gaussian noise corresponds to a 1-weakly convex regularizer given by an upper Moreau envelope of the negative log-marginal density, which yields the first sublinear convergence rates for plug-and-play proximal gradient descent.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper establishes that the minimum-mean-squared-error (MMSE) denoiser for Gaussian noise acts as the proximal operator of a regularizer that equals an upper Moreau envelope of the negative log-marginal density of the noisy observation. The envelope structure directly implies that the regularizer is 1-weakly convex. With this property in hand, the authors obtain an explicit sublinear convergence rate for plug-and-play (PnP) proximal gradient descent. This matters because earlier analyses of PnP methods with MMSE denoisers established only asymptotic convergence; nonasymptotic rates indicate how many iterations are needed in practice for tasks such as image deblurring and computed tomography.

Core claim

The MMSE denoiser corresponds to a regularizer that can be written explicitly as an upper Moreau envelope of the negative log-marginal density, which in turn implies that the regularizer is 1-weakly convex. Using this property, the authors derive the first sublinear convergence guarantee for PnP proximal gradient descent with an MMSE denoiser.
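As a minimal sketch of the object being analyzed, the snippet below evaluates a one-dimensional MMSE denoiser in closed form and compares it with the Tweedie form y + d/dy log p_Y(y), where p_Y is the density of the noisy observation Y = X + N with N ~ N(0, 1). The two-component Gaussian-mixture prior, its parameter values, and all function names are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

# Hypothetical Gaussian-mixture prior (illustrative values, not from the paper).
W = np.array([0.5, 0.5])     # mixture weights
M = np.array([-2.0, 2.0])    # component means
S2 = np.array([0.25, 0.25])  # component variances
VAR = S2 + 1.0               # variances of Y = X + N, with N ~ N(0, 1)


def log_components(y):
    """Log of each weighted Gaussian component of the marginal p_Y at y."""
    y = np.asarray(y, dtype=float)[..., None]
    return np.log(W) - 0.5 * np.log(2 * np.pi * VAR) - 0.5 * (y - M) ** 2 / VAR


def log_marginal(y):
    """log p_Y(y), computed stably with a log-sum-exp over components."""
    return np.logaddexp.reduce(log_components(y), axis=-1)


def mmse_denoiser(y):
    """Posterior mean E[X | Y = y], available in closed form for this prior."""
    lc = log_components(y)
    resp = np.exp(lc - np.logaddexp.reduce(lc, axis=-1, keepdims=True))
    y = np.asarray(y, dtype=float)[..., None]
    return np.sum(resp * (M + (S2 / VAR) * (y - M)), axis=-1)


def tweedie_denoiser(y, h=1e-5):
    """y + d/dy log p_Y(y), with a central finite-difference derivative."""
    y = np.asarray(y, dtype=float)
    return y + (log_marginal(y + h) - log_marginal(y - h)) / (2 * h)


ys = np.linspace(-6.0, 6.0, 241)
max_gap = float(np.max(np.abs(mmse_denoiser(ys) - tweedie_denoiser(ys))))
is_monotone = bool(np.all(np.diff(mmse_denoiser(ys)) >= 0))
print(max_gap, is_monotone)
```

The two evaluations should agree up to finite-difference error, and the denoiser should be nondecreasing, which is consistent with it being the gradient of a convex potential.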

What carries the argument

Upper Moreau envelope of the negative log-marginal density, which serves as the explicit regularizer induced by the MMSE denoiser and establishes its 1-weak convexity.

If this is right

PnP proximal gradient descent with an MMSE denoiser converges at a sublinear rate.
The implicit regularizer induced by the MMSE denoiser can be recovered explicitly in one-dimensional synthetic studies.
Deblurring and computed-tomography experiments display the predicted sublinear convergence behavior.
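The iteration these consequences refer to can be sketched on a toy deblurring problem. The snippet assumes the common PnP-PGD form x_{k+1} = D(x_k − τ ∇f(x_k)) with f(x) = ½‖Ax − b‖², a coordinate-wise closed-form MMSE denoiser D for an i.i.d. Gaussian-mixture prior with unit-variance denoising noise, and the step-to-step residual ‖x_{k+1} − x_k‖ as a stand-in for the stationarity measure. The prior, blur kernel, problem size, step size, and names are all hypothetical choices, not the paper's experimental setup.

```python
import numpy as np

# Hypothetical i.i.d. Gaussian-mixture prior (illustrative values only).
W = np.array([0.5, 0.5])
M = np.array([-2.0, 2.0])
S2 = np.array([0.25, 0.25])
VAR = S2 + 1.0  # the denoiser is the exact MMSE estimator for unit-variance noise


def mmse_denoiser(y):
    """Coordinate-wise posterior mean E[X | Y = y] for the mixture prior."""
    y = np.asarray(y, dtype=float)[..., None]
    lc = np.log(W) - 0.5 * np.log(2 * np.pi * VAR) - 0.5 * (y - M) ** 2 / VAR
    resp = np.exp(lc - np.logaddexp.reduce(lc, axis=-1, keepdims=True))
    return np.sum(resp * (M + (S2 / VAR) * (y - M)), axis=-1)


rng = np.random.default_rng(0)
n = 32
x_true = rng.choice(M, size=n) + np.sqrt(S2[0]) * rng.standard_normal(n)

# Circular blur with a 3-tap kernel as the forward operator A.
A = np.zeros((n, n))
for i in range(n):
    for shift, weight in zip((-1, 0, 1), (0.25, 0.5, 0.25)):
        A[i, (i + shift) % n] = weight
b = A @ x_true + 0.1 * rng.standard_normal(n)

L = np.linalg.norm(A, 2) ** 2  # Lipschitz constant of the gradient of f
tau = 0.9 / L                  # step size below 1/L

x = np.zeros(n)
residuals = []
for _ in range(200):
    grad = A.T @ (A @ x - b)                # gradient of the data-fidelity term
    x_next = mmse_denoiser(x - tau * grad)  # denoiser replaces the prox step
    residuals.append(float(np.linalg.norm(x_next - x)))
    x = x_next

print(residuals[0], residuals[-1])
```

Plotting `residuals` on a log-log scale is one way to compare the observed decay against a sublinear reference line; the sketch makes no claim about the paper's constants.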

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The same envelope representation might be used to analyze convergence of other plug-and-play algorithms such as ADMM if weak convexity can be established for those iterates.
Approximate learned denoisers that stay close to the true MMSE operator could inherit approximate versions of the same rate guarantees.
The explicit envelope form opens a route to designing new regularizers that mimic the MMSE structure in other inverse problems.

Load-bearing premise

The noise must be exactly Gaussian and the denoiser must be the precise MMSE operator without approximation or clipping.

What would settle it

A one-dimensional synthetic experiment that fails to recover a regularizer matching the upper Moreau envelope, or imaging experiments under exact Gaussian noise and exact MMSE denoising in which the stationarity residual decays more slowly than the predicted sublinear rate, would falsify the claims.
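A one-dimensional check of this kind might look like the following sketch. It assumes unit-variance Gaussian noise, builds R(x) = sup_y { −log p_Y(y) − ½(x − y)² } on a grid, and then tests two things: that the grid minimizer of R(x) + ½(x − y₀)² lands on the closed-form MMSE output, and that R(x) + ½x² has nonnegative second differences (1-weak convexity). The Gaussian-mixture prior, grid ranges, test points, and names are assumptions chosen for illustration.

```python
import numpy as np

# Hypothetical Gaussian-mixture prior (illustrative values only).
W = np.array([0.5, 0.5])
M = np.array([-2.0, 2.0])
S2 = np.array([0.25, 0.25])
VAR = S2 + 1.0  # variances of Y = X + N, with N ~ N(0, 1)


def log_components(y):
    y = np.asarray(y, dtype=float)[..., None]
    return np.log(W) - 0.5 * np.log(2 * np.pi * VAR) - 0.5 * (y - M) ** 2 / VAR


def neg_log_marginal(y):
    """-log p_Y(y) for the noisy observation."""
    return -np.logaddexp.reduce(log_components(y), axis=-1)


def mmse_denoiser(y):
    """Closed-form posterior mean E[X | Y = y]."""
    lc = log_components(y)
    resp = np.exp(lc - np.logaddexp.reduce(lc, axis=-1, keepdims=True))
    y = np.asarray(y, dtype=float)[..., None]
    return np.sum(resp * (M + (S2 / VAR) * (y - M)), axis=-1)


y_grid = np.linspace(-12.0, 12.0, 4801)  # grid for the supremum
x_grid = np.linspace(-3.0, 3.0, 601)     # grid on which R is tabulated
g = neg_log_marginal(y_grid)

# Upper Moreau envelope on the grid: R(x) = max_y { g(y) - 0.5 * (x - y)^2 }.
R = np.max(g[None, :] - 0.5 * (x_grid[:, None] - y_grid[None, :]) ** 2, axis=1)


def prox_R(y0):
    """Grid minimizer of R(x) + 0.5 * (x - y0)^2."""
    return x_grid[np.argmin(R + 0.5 * (x_grid - y0) ** 2)]


test_points = np.array([-3.0, -1.0, -0.3, 0.5, 2.0, 4.0])
prox_vals = np.array([prox_R(y0) for y0 in test_points])
max_prox_gap = float(np.max(np.abs(prox_vals - mmse_denoiser(test_points))))

# 1-weak convexity: R(x) + 0.5 * x^2 should have nonnegative second differences.
min_second_diff = float(np.min(np.diff(R + 0.5 * x_grid ** 2, n=2)))
print(max_prox_gap, min_second_diff)
```

A prox gap well above the grid resolution, or clearly negative second differences, would be the kind of mismatch this criterion describes.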

Figures

Figures reproduced from arXiv: 2510.27211 by Henry Pritchard, Rahul Parhi.

**Figure 1.** Calculated and learned regularizers for an MMSE denoiser under mixture-of-Gaussian priors with unit Gaussian noise. The calculated regularizer is …

**Figure 2.** Calculated and learned regularizers for an MMSE denoiser under mixture-of-Laplacian priors with unit Gaussian noise. The calculated regularizer is …

**Figure 3.** MMSE denoiser, convex potential, and implicit MMSE regularizer for a Laplacian prior.

**Figure 4.** PnP-PGD results on the MNIST dataset under Gaussian blur.

**Figure 5.** Stationary residual (41) over 50 iterations for PnP-PGD using an MMSE denoiser on the MNIST dataset under Gaussian blur.

**Figure 6.** Randomly selected 128 × 128 region of the Shepp–Logan phantom. Clean, noisy/blurred, and PnP-PGD reconstruction with an MMSE denoiser after 5 iterations. Gaussian noise; per-panel PSNR/SSIM are denoted.

**Figure 7.** Randomly selected 128 × 128 region of the Shepp–Logan phantom. Clean, noisy/blurred, and PnP-PGD reconstruction with an MMSE denoiser after 5 iterations. Gaussian noise; per-panel PSNR/SSIM are denoted.

D. Architectural Details. We build upon the implementation of the Input Convex Neural Network (ICNN) in [22] to approximate the MMSE denoiser. The authors produced excellent results with softplus activation…

Original abstract

It is known that the minimum-mean-squared-error (MMSE) denoiser under Gaussian noise can be written as a proximal operator, which suffices for asymptotic convergence of plug-and-play (PnP) methods but does not reveal the structure of the induced regularizer or give convergence rates. We show that the MMSE denoiser corresponds to a regularizer that can be written explicitly as an upper Moreau envelope of the negative log-marginal density, which in turn implies that the regularizer is 1-weakly convex. Using this property, we derive (to the best of our knowledge) the first sublinear convergence guarantee for PnP proximal gradient descent with an MMSE denoiser. We validate the theory with a one-dimensional synthetic study that recovers the implicit regularizer. We also validate the theory with imaging experiments (deblurring and computed tomography), which exhibit the predicted sublinear behavior.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

This paper gives the first sublinear nonasymptotic convergence rate for PnP proximal gradient with exact MMSE denoisers by showing the induced regularizer is 1-weakly convex as an upper Moreau envelope.

The main thing to know is that this paper shows how to get the first sublinear convergence rate for plug-and-play proximal gradient descent when the denoiser is the exact MMSE estimator under Gaussian noise. They do this by identifying the implicit regularizer as the upper Moreau envelope of the negative log-marginal density, which turns out to be 1-weakly convex. That link lets them apply standard results from weakly convex optimization, which earlier PnP papers apparently did not have for non-asymptotic guarantees. The one-dimensional synthetic experiment recovers the implicit regularizer as predicted, and the deblurring and CT experiments show the sublinear decay in practice. That is decent validation for a theory paper.

The soft spots sit mostly in the modeling assumptions. The whole argument needs exact Gaussian noise and a precise MMSE operator; any trained network that only approximates the MMSE breaks the weak-convexity and the rate directly. The rate constants may also depend on quantities that are not easy to check in real data, though the abstract does not detail the dependence. Still, under the stated ideal conditions the logic holds up without obvious gaps.

This is for researchers working on convergence analysis of PnP and related methods in imaging and inverse problems. Anyone who wants to move these algorithms from heuristics toward rates will find it useful. The math looks grounded enough that it should go to a serious referee rather than a desk reject. I would recommend sending it for peer review.

Referee Report

2 major / 3 minor

Summary. The manuscript claims that the MMSE denoiser for unit-variance Gaussian noise equals the proximal mapping of the upper Moreau envelope of the negative log-marginal density. This structure implies that the induced regularizer is 1-weakly convex. Consequently, standard results for proximal gradient methods on 1-weakly convex objectives yield sublinear convergence rates for the PnP proximal gradient descent iteration. The paper supports these claims with a one-dimensional synthetic experiment that recovers the implicit regularizer and with deblurring and computed tomography experiments that exhibit the predicted sublinear behavior.

Significance. If the central derivation holds, this paper makes a notable contribution by providing the first sublinear nonasymptotic convergence guarantee for plug-and-play proximal gradient descent using MMSE denoisers. Previous results were limited to asymptotic convergence based on the proximal representation alone. The explicit link to the upper Moreau envelope and 1-weak convexity is a clean and useful insight that could facilitate further analysis of PnP methods. The synthetic validation is particularly strong as it directly recovers the regularizer, while the imaging results provide empirical support for the theory under the stated assumptions.

major comments (2)

The proof that the regularizer is 1-weakly convex relies on showing that R(y) + (1/2)||y||² is the convex conjugate of g(x) - (1/2)||x||². This step is load-bearing for the sublinear rate; please ensure the conjugate relation is derived without additional assumptions on p_Y beyond those needed for the MMSE to exist.
The sublinear rate is for the stationarity measure; it would be helpful to specify the exact form of the rate (e.g., O(1/k)) and whether the hidden constants are independent of the problem dimensions or noise level, as this impacts the nonasymptotic nature of the result.
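One way to see why a conjugate relation of this kind yields 1-weak convexity is sketched below for unit-variance Gaussian noise. The notation is an assumption made for this sketch: $p_X$ is the prior density, $p_Y$ is the density of the noisy observation, $h = -\log p_Y$, and $\psi$ is an auxiliary potential. How $h$ and $\psi$ map onto the symbol $g$ used in the first comment is not specified on this page.

```latex
\begin{align*}
\psi(y) &:= \tfrac12\|y\|^2 + \log p_Y(y) = \tfrac12\|y\|^2 - h(y)
  = \log \int e^{\langle x, y\rangle - \frac12\|x\|^2}\, p_X(x)\,dx + \text{const}
  && \text{(a log-Laplace transform, hence convex)},\\
D(y) &= \mathbb{E}[X \mid Y = y] = y + \nabla \log p_Y(y) = \nabla \psi(y)
  && \text{(Tweedie's formula)},\\
R(x) &:= \sup_{y}\Big\{ h(y) - \tfrac12\|x - y\|^2 \Big\}
  && \text{(upper Moreau envelope of } h\text{)},\\
R(x) + \tfrac12\|x\|^2 &= \sup_{y}\Big\{ \langle x, y\rangle - \psi(y) \Big\} = \psi^*(x)
  && \text{(a supremum of affine functions of } x\text{, hence convex)}.
\end{align*}
```

Convexity of $R + \tfrac12\|\cdot\|^2$ is exactly the statement that $R$ is 1-weakly convex. In addition, $x = \nabla\psi(y)$ gives $y \in \partial\psi^*(x)$, so $x$ minimizes the convex function $\psi^*(\cdot) - \langle \cdot, y\rangle = R(\cdot) + \tfrac12\|\cdot - y\|^2 - \tfrac12\|y\|^2$, which is the characterization $D = \operatorname{prox}_R$. Whether the paper needs further regularity on $p_Y$ for these steps is the question the first comment raises; the sketch does not settle it.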

minor comments (3)

The phrase 'to the best of our knowledge' for the first sublinear guarantee is appropriate, but adding a sentence contrasting with existing asymptotic results would improve context.
In the imaging experiments, specify the exact implementation of the MMSE denoiser (e.g., closed-form or numerical) to allow reproducibility, especially since the theory requires the exact operator.
Define the upper Moreau envelope more clearly at first use, distinguishing it from the standard lower Moreau envelope used in proximal operators.
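For reference, the distinction requested in the last minor comment can be written as follows. The symbols $M_\lambda$ and $M^\lambda$ and the scaling by $\lambda$ are one common convention, assumed here for illustration; the paper's own notation may differ.

```latex
\begin{align*}
\text{lower envelope:}\quad M_{\lambda} f(x) &= \inf_{y}\Big\{ f(y) + \tfrac{1}{2\lambda}\|x - y\|^2 \Big\},\\
\text{upper envelope:}\quad M^{\lambda} f(x) &= \sup_{y}\Big\{ f(y) - \tfrac{1}{2\lambda}\|x - y\|^2 \Big\} = -\,M_{\lambda}(-f)(x).
\end{align*}
```

The lower envelope is the smoothing whose minimizer defines the proximal operator of $f$. The upper envelope instead satisfies $M^{\lambda} f(x) + \tfrac{1}{2\lambda}\|x\|^2 = \sup_y \{ f(y) - \tfrac{1}{2\lambda}\|y\|^2 + \tfrac{1}{\lambda}\langle x, y\rangle \}$, a supremum of affine functions of $x$, so whenever it is finite it is $(1/\lambda)$-weakly convex. With $\lambda = 1$ this is the source of the constant 1 in the weak-convexity claim.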

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the positive assessment of our manuscript and for the constructive comments. We are glad that the link between MMSE denoisers and 1-weakly convex regularizers via upper Moreau envelopes is viewed as a clean insight enabling the first sublinear nonasymptotic rates for PnP proximal gradient descent. We address each major comment below.

Referee: The proof that the regularizer is 1-weakly convex relies on showing that R(y) + (1/2)||y||² is the convex conjugate of g(x) - (1/2)||x||². This step is load-bearing for the sublinear rate; please ensure the conjugate relation is derived without additional assumptions on p_Y beyond those needed for the MMSE to exist.

Authors: We confirm that the derivation relies solely on the minimal assumptions required for the MMSE denoiser to exist (i.e., that the marginal density p_Y yields a well-defined conditional expectation with finite second moment). The upper Moreau envelope of the negative log-marginal is introduced directly from the definition of the MMSE estimator, and the conjugate relation R(y) + (1/2)||y||² = sup_x { <x,y> - (g(x) - (1/2)||x||²) } follows from standard convex-analytic arguments without imposing any further regularity on p_Y. We will add an explicit remark after the statement of Theorem 3.2 clarifying these minimal assumptions in the revised manuscript.
Revision: yes

Referee: The sublinear rate is for the stationarity measure; it would be helpful to specify the exact form of the rate (e.g., O(1/k)) and whether the hidden constants are independent of the problem dimensions or noise level, as this impacts the nonasymptotic nature of the result.

Authors: We appreciate the request for greater precision. Theorem 4.1 establishes that the expected stationarity measure satisfies a bound of the form O(1/k) after k iterations of PnP proximal gradient descent. The hidden constants depend on the smoothness modulus of the data-fidelity term and the weak-convexity parameter (fixed at 1), but are independent of the noise level because the analysis is carried out for unit-variance Gaussian noise as stated in the problem setup. Dependence on dimension enters only through the Lipschitz constant of the gradient of the data term, which is problem-specific and may scale with dimension in high-dimensional imaging applications. We will explicitly state the O(1/k) rate and discuss the dependence of the constants in the revised version.
Revision: yes
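To make the shape of an O(1/k) bound concrete, here is a textbook sufficient-decrease argument for proximal gradient descent. It is a sketch under stated assumptions, not a restatement of the paper's Theorem 4.1: the iteration is taken to be $x_{k+1} = D(x_k - \tau \nabla f(x_k))$ with $D = \operatorname{prox}_R$, the data term $f$ has an $L$-Lipschitz gradient, the step size satisfies $0 < \tau < 1/L$, and the objective $F := f + \tfrac{1}{\tau} R$ is bounded below by $F^\star$. The choice of $F$, the residual, and the constants are all assumptions of the sketch.

```latex
\begin{align*}
\tfrac{1}{\tau}R(x_{k+1}) + \langle \nabla f(x_k),\, x_{k+1} - x_k\rangle + \tfrac{1}{2\tau}\|x_{k+1} - x_k\|^2 &\le \tfrac{1}{\tau}R(x_k)
  && \text{(optimality of the prox step)},\\
f(x_{k+1}) &\le f(x_k) + \langle \nabla f(x_k),\, x_{k+1} - x_k\rangle + \tfrac{L}{2}\|x_{k+1} - x_k\|^2
  && \text{(descent lemma)},\\
F(x_{k+1}) &\le F(x_k) - \frac{1 - \tau L}{2\tau}\,\|x_{k+1} - x_k\|^2
  && \text{(sum of the two)},\\
\min_{0 \le k < K}\|x_{k+1} - x_k\|^2 &\le \frac{2\tau\,\big(F(x_0) - F^\star\big)}{(1 - \tau L)\,K}
  && \text{(telescoping over } k\text{)}.
\end{align*}
```

In this generic form the constant depends only on $\tau$, $L$, and the initial objective gap, which matches the kind of dependence the response describes; any dimension dependence would enter through $L$ and $F(x_0) - F^\star$.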

Circularity Check

0 steps flagged

No significant circularity; derivation is self-contained

The paper begins from the externally known proximal-operator representation of the exact MMSE denoiser under Gaussian noise, then applies standard Moreau-envelope identities to exhibit an explicit regularizer R that is 1-weakly convex by construction of the convex conjugate. Standard sublinear stationarity rates for proximal-gradient methods on 1-weakly-convex objectives are then invoked. None of these steps reduces to a fitted parameter, a self-referential definition, or a load-bearing self-citation; the argument is a direct mathematical implication under the stated modeling assumptions and is therefore self-contained.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The paper rests on the standard fact that the MMSE denoiser is a proximal operator for some regularizer, plus the definition of the upper Moreau envelope. No new free parameters or invented entities are introduced; the weak-convexity constant of 1 follows directly from envelope properties.

axioms (2)

domain assumption: The MMSE denoiser under additive Gaussian noise is exactly the proximal operator of some regularizer.
Stated as known in the first sentence of the abstract; this is the starting point for the envelope construction.

standard math: The negative log-marginal density is a proper lower-semicontinuous function so that the upper Moreau envelope is well-defined.
Implicit in the claim that the regularizer can be written as an upper Moreau envelope.

pith-pipeline@v0.9.0 · 5688 in / 1628 out tokens · 30537 ms · 2026-05-18T03:34:56.233495+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear

Paper passage: We show that the MMSE denoiser corresponds to a regularizer that can be written explicitly as an upper Moreau envelope of the negative log-marginal density, which in turn implies that the regularizer is 1-weakly convex.

IndisputableMonolith/Foundation/AlphaCoordinateFixation.lean · costAlphaLog_fourth_deriv_at_zero · unclear

Paper passage: ϕ_MMSE(x) = σ² M^σ² f_Z(x) − σ² C_x0

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

61 extracted references · 61 canonical work pages

[1] R. Ahmad, C. A. Bouman, G. T. Buzzard, S. Chan, S. Liu, E. T. Reehorst, and P. Schniter, “Plug-and-play methods for magnetic resonance imaging: Using denoisers for image recovery,” IEEE Signal Processing Magazine, vol. 37, no. 1, pp. 105–116, 2020.

[2] H. Attouch, J. Bolte, and B. F. Svaiter, “Convergence of descent methods for semi-algebraic and tame problems: proximal algorithms, forward–backward splitting, and regularized Gauss–Seidel methods,” Mathematical Programming, vol. 137, no. 1, pp. 91–129, 2013.

[3] H. H. Bauschke and P. L. Combettes, “Convex analysis and monotone operator theory in Hilbert spaces,” CMS Books in Mathematics, Ouvrages de mathématiques de la SMC, 2011.

[4] P. Bernard, “Lasry–Lions regularization and a lemma of Ilmanen,” Rendiconti del Seminario Matematico della Università di Padova, vol. 124, pp. 221–229, 2010.

[5] A. Böhm and S. J. Wright, “Variable smoothing for weakly convex composite functions,” Journal of Optimization Theory and Applications, vol. 188, no. 3, pp. 628–649, 2021.

[6] A. Bora, A. Jalal, E. Price, and A. G. Dimakis, “Compressed sensing using generative models,” in International Conference on Machine Learning. PMLR, 2017, pp. 537–546.

[7] S. Boyd, N. Parikh, E. Chu, B. Peleato, J. Eckstein et al., “Distributed optimization and statistical learning via the alternating direction method of multipliers,” Foundations and Trends® in Machine Learning, vol. 3, no. 1, pp. 1–122, 2011.

[8] K. Bredies, J. Chirinos-Rodriguez, and E. Naldi, “Learning firmly nonexpansive operators,” arXiv preprint arXiv:2407.14156, 2024.

[9] E. Candes, J. Romberg, and T. Tao, “Robust uncertainty principles: exact signal reconstruction from highly incomplete frequency information,” IEEE Transactions on Information Theory, vol. 52, no. 2, pp. 489–509, 2006.

[10] A. Chambolle, “An algorithm for total variation minimization and applications,” Journal of Mathematical Imaging and Vision, vol. 20, no. 1, pp. 89–97, 2004.
[11] A. Chambolle and T. Pock, “A first-order primal-dual algorithm for convex problems with applications to imaging,” Journal of Mathematical Imaging and Vision, vol. 40, no. 1, pp. 120–145, 2011.

[12] S. H. Chan, X. Wang, and O. A. Elgendy, “Plug-and-play ADMM for image restoration: Fixed-point convergence and applications,” IEEE Transactions on Computational Imaging, vol. 3, no. 1, pp. 84–98, 2016.

[13] P. L. Combettes, “Solving monotone inclusions via compositions of nonexpansive averaged operators,” Optimization, vol. 53, no. 5-6, pp. 475–504, 2004.

[14] P. L. Combettes and J.-C. Pesquet, “Proximal splitting methods in signal processing,” in Fixed-Point Algorithms for Inverse Problems in Science and Engineering. Springer, 2011, pp. 185–212.

[15] P. L. Combettes and V. R. Wajs, “Signal recovery by proximal forward-backward splitting,” Multiscale Modeling & Simulation, vol. 4, no. 4, pp. 1168–1200, 2005.

[16] K. Dabov, A. Foi, V. Katkovnik, and K. Egiazarian, “Image denoising by sparse 3-D transform-domain collaborative filtering,” IEEE Transactions on Image Processing, vol. 16, no. 8, pp. 2080–2095, 2007.

[17] D. L. Donoho, “Compressed sensing,” IEEE Transactions on Information Theory, vol. 52, no. 4, pp. 1289–1306, 2006.

[18] S. Ducotterd, S. Neumayer, and M. Unser, “Learning of patch-based smooth-plus-sparse models for image reconstruction,” in The Second Conference on Parsimony and Learning, 2025.

[19] S. Ducotterd and M. Unser, “Multivariate fields of experts,” arXiv preprint arXiv:2508.06490, 2025.

[20] B. Efron, “Tweedie’s formula and selection bias,” Journal of the American Statistical Association, vol. 106, no. 496, pp. 1602–1614, 2011.
[21] H. Engl, M. Hanke, and A. Neubauer, Regularization of Inverse Problems, ser. Mathematics and Its Applications. Springer Netherlands, 1996.

[22] Z. Fang, S. Buchanan, and J. Sulam, “What’s in a prior? Learned proximal networks for inverse problems,” arXiv preprint arXiv:2310.14344, 2023.

[23] R. G. Gavaskar and K. N. Chaudhury, “Plug-and-play ISTA converges with kernel denoisers,” IEEE Signal Processing Letters, vol. 27, pp. 610–614, 2020.

[24] A. Goujon, S. Neumayer, P. Bohra, S. Ducotterd, and M. Unser, “A neural-network-based convex regularizer for inverse problems,” IEEE Transactions on Computational Imaging, vol. 9, pp. 781–795, 2023.

[25] R. Gribonval and P. Machart, “Reconciling ‘priors’ & ‘priors’ without prejudice?” in Advances in Neural Information Processing Systems, vol. 26, 2013.

[26] R. Gribonval and M. Nikolova, “A characterization of proximity operators,” Journal of Mathematical Imaging and Vision, vol. 62, no. 6, pp. 773–789, 2020.

[27] R. Gribonval, “Should penalized least squares regression be interpreted as maximum a posteriori estimation?” IEEE Transactions on Signal Processing, vol. 59, no. 5, pp. 2405–2410, 2011.

[28] J. Hadamard, “Sur les problèmes aux dérivées partielles et leur signification physique,” Princeton University Bulletin, pp. 49–52, 1902.

[29] J. Hertrich, H. S. Wong, A. Denker, S. Ducotterd, Z. Fang, M. Haltmeier, Ž. Kereta, E. Kobler, O. Leong, M. S. Salehi et al., “Learning regularization functionals for inverse problems: A comparative study,” arXiv preprint arXiv:2510.01755, 2025.

[30] S. Hurault, A. Leclaire, and N. Papadakis, “Proximal denoiser for convergent plug-and-play optimization with nonconvex regularization,” in International Conference on Machine Learning. PMLR, 2022, pp. 9483–9505.
[31] V. Jain and S. Seung, “Natural image denoising with convolutional networks,” Advances in Neural Information Processing Systems, vol. 21, 2008.

[32] S. M. Kakade, “Dimension free optimization and non-convex optimization,” CSE 547 Machine Learning for Big Data Lecture Slides, University of Washington, 2018, lecture notes, Spring 2018.

[33] U. S. Kamilov, C. A. Bouman, G. T. Buzzard, and B. Wohlberg, “Plug-and-play methods for integrating physical and learned models in computational imaging: Theory, algorithms, and applications,” IEEE Signal Processing Magazine, vol. 40, no. 1, pp. 85–97, 2023.

[34] Y. LeCun, C. Cortes, and C. Burges, “MNIST handwritten digit database,” 2010.

[35] H. Li and Z. Lin, “Accelerated proximal gradient methods for nonconvex programming,” Advances in Neural Information Processing Systems, vol. 28, 2015.

[36] J. Liu, Y. Sun, X. Xu, and U. S. Kamilov, “Image restoration using total variation regularized deep image prior,” in ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2019, pp. 7715–7719.

[37] C. McCollough, “TU-FG-207A-04: Overview of the low dose CT grand challenge,” Medical Physics, vol. 43, pp. 3759–3760, Jun. 2016.

[38] P. Milanfar and M. Delbracio, “Denoising: a powerful building block for imaging, inverse problems and machine learning,” Philosophical Transactions A, vol. 383, no. 2299, p. 20240326, 2025.

[39] G. Montúfar, R. Pascanu, K. Cho, and Y. Bengio, “On the number of linear regions of deep neural networks,” Advances in Neural Information Processing Systems, vol. 27, 2014.

[40] J. J. Moreau, “Fonctions convexes duales et points proximaux dans un espace hilbertien,” Comptes rendus hebdomadaires des séances de l’Académie des sciences, vol. 255, pp. 2897–2899, 1962.
[41] J.-J. Moreau, “Proximité et dualité dans un espace hilbertien,” Bulletin de la Société mathématique de France, vol. 93, pp. 273–299, 1965.

[42] P. Nair, R. G. Gavaskar, and K. N. Chaudhury, “Fixed-point and objective convergence of plug-and-play algorithms,” IEEE Transactions on Computational Imaging, vol. 7, pp. 337–348, 2021.

[43] G. Ongie, A. Jalal, C. A. Metzler, R. G. Baraniuk, A. G. Dimakis, and R. Willett, “Deep learning techniques for inverse problems in imaging,” IEEE Journal on Selected Areas in Information Theory, vol. 1, no. 1, pp. 39–56, 2020.

[44] R. Parhi and M. Unser, “The sparsity of cycle spinning for wavelet-based solutions of linear inverse problems,” IEEE Signal Processing Letters, vol. 30, pp. 568–572, 2023.

[45] N. Parikh, S. Boyd et al., “Proximal algorithms,” Foundations and Trends® in Optimization, vol. 1, no. 3, pp. 127–239, 2014.

[46] C. Park, S. Shoushtari, W. Gan, and U. S. Kamilov, “Convergence of nonconvex PnP-ADMM with MMSE denoisers,” in 2023 IEEE 9th International Workshop on Computational Advances in Multi-Sensor Adaptive Processing (CAMSAP). IEEE, 2023, pp. 511–515.

[47] J.-C. Pesquet, A. Repetti, M. Terris, and Y. Wiaux, “Learning maximally monotone operators for image recovery,” SIAM Journal on Imaging Sciences, vol. 14, no. 3, pp. 1206–1237, 2021.

[48] M. Pourya, E. Kobler, M. Unser, and S. Neumayer, “DEALing with image reconstruction: Deep attentive least squares,” in International Conference on Machine Learning (ICML), 2025.

[49] M. Raghu, B. Poole, J. Kleinberg, S. Ganguli, and J. Sohl-Dickstein, “On the expressive power of deep neural networks,” in International Conference on Machine Learning. PMLR, 2017, pp. 2847–2854.

[50] R. Rockafellar, Convex Analysis, 1970, vol. 28.
[51]

R. T. Rockafellar and R. J. Wets,Variational analysis. Springer, 1998

work page 1998
[52]

Nonlinear total variation based noise removal algorithms,

L. I. Rudin, S. Osher, and E. Fatemi, “Nonlinear total variation based noise removal algorithms,”Physica D: nonlinear phenomena, vol. 60, no. 1-4, pp. 259–268, 1992

work page 1992
[53]

Plug-and- play methods provably converge with properly trained denoisers,

E. Ryu, J. Liu, S. Wang, X. Chen, Z. Wang, and W. Yin, “Plug-and- play methods provably converge with properly trained denoisers,” in International Conference on Machine Learning. PMLR, 2019, pp. 5546–5557

work page 2019
[54]

The fourier reconstruction of a head section,

L. A. Shepp and B. F. Logan, “The fourier reconstruction of a head section,”IEEE Transactions on nuclear science, vol. 21, no. 3, pp. 21–43, 1974

work page 1974
[55]

A bound for the error in the normal approximation to the distribution of a sum of dependent random variables,

C. Stein, “A bound for the error in the normal approximation to the distribution of a sum of dependent random variables,” inProceedings of the sixth Berkeley symposium on mathematical statistics and probability, volume 2: Probability theory, vol. 6. University of California Press, 1972, pp. 583–603

work page 1972
[56]

An online plug-and-play algorithm for regularized image reconstruction,

Y . Sun, B. Wohlberg, and U. S. Kamilov, “An online plug-and-play algorithm for regularized image reconstruction,”IEEE Transactions on Computational Imaging, vol. 5, no. 3, pp. 395–408, 2019

work page 2019
[57]

Building firmly nonexpansive convolutional neural networks,

M. Terris, A. Repetti, J.-C. Pesquet, and Y . Wiaux, “Building firmly nonexpansive convolutional neural networks,” inICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2020, pp. 8658–8662

work page 2020
[58]

Plug-and-play priors for model based reconstruction,

S. V . Venkatakrishnan, C. A. Bouman, and B. Wohlberg, “Plug-and-play priors for model based reconstruction,” in2013 IEEE Global Conference on Signal and Information Processing, 2013, pp. 945–948

work page 2013
[59]

Provable convergence of plug-and-play priors with mmse denoisers,

X. Xu, Y . Sun, J. Liu, B. Wohlberg, and U. S. Kamilov, “Provable convergence of plug-and-play priors with mmse denoisers,”IEEE Signal Processing Letters, vol. 27, pp. 1280–1284, 2020

[60]

K. Zhang, W. Zuo, Y. Chen, D. Meng, and L. Zhang, “Beyond a Gaussian denoiser: Residual learning of deep CNN for image denoising,” IEEE Transactions on Image Processing, vol. 26, no. 7, pp. 3142–3155, 2017

[61]

H. Zhao, L. Qian, Y. Zhu, and D. Tian, “Low dose CT image denoising: A comparative study of deep learning models and training strategies,” AI Medicine, pp. 7–7, 2024

