Variable Bregman Majorization-Minimization algorithms for nonconvex nonsmooth optimization, with application to Poisson imaging
Pith reviewed 2026-05-10 14:44 UTC · model grok-4.3
The pith
A variable Bregman majorization-minimization framework converges to critical points for nonconvex nonsmooth problems under the Kurdyka-Lojasiewicz property without requiring Lipschitz smoothness.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
A unifying Bregman-based majorization-minimization framework is introduced for nonconvex nonsmooth optimization. The framework leverages Bregman divergences, possibly varying across iterations, to construct tailored surrogate functions that majorize the objective. Convergence of the iterates to critical points is established under the Kurdyka-Lojasiewicz property, relaxing standard assumptions such as the Lipschitz smoothness of the nonconvex objective function. A constructive methodology is derived to build a broad class of variable Bregman majorants with tractable minimizers, encapsulating various existing majorization techniques, in particular those derived for Poisson data fidelity terms
What carries the argument
Variable Bregman majorants: surrogate functions derived from possibly iteration-dependent Bregman divergences that majorize the objective while admitting tractable minimization steps.
If this is right
- The algorithm converges to critical points for a wider class of nonconvex nonsmooth objectives that lack Lipschitz continuity.
- Existing majorization techniques for Poisson imaging inverse problems become special cases of the proposed framework.
- The scheme enables solving nonconvex regularized inverse problems in imaging without smoothness assumptions on the objective.
Where Pith is reading between the lines
- Allowing the Bregman divergence to change at each step may yield faster practical convergence than fixed-surrogate methods on the same problems.
- The constructive majorant design could be adapted to other data fidelity terms arising in different inverse problems.
- The framework opens the possibility of deriving convergence guarantees for additional nonconvex imaging regularizers beyond the PET example.
Load-bearing premise
The objective function satisfies the Kurdyka-Lojasiewicz property and suitable variable Bregman majorants with tractable minimizers can be constructed for the given problem.
What would settle it
A concrete nonconvex nonsmooth problem that satisfies the Kurdyka-Lojasiewicz property for which the constructed variable Bregman majorant produces iterates that fail to converge to any critical point.
Figures
read the original abstract
In this work, we introduce a unifying Bregman-based majorization-minimization (MM) framework for solving nonconvex nonsmooth optimization problems. The proposed approach leverages Bregman divergences, possibly varying across iterations, to construct tailored surrogate functions that majorize the objective. We establish the convergence of the iterates of the resulting variable Bregman MM algorithm to critical points under the Kurdyka-Lojasiewicz property, relaxing standard assumptions such as the Lipschitz smoothness of the nonconvex objective function. We derive a constructive methodology to build a broad class of variable Bregman majorants with tractable minimizers. Our study encapsulates various existing majorization techniques, in particular those derived for Poisson data fidelity terms in imaging inverse problems. Numerical experiments on Positron Emission Tomography (PET) image reconstruction with a nonconvex regularizer showcase the practical feasibility of the proposed scheme.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces a unifying variable Bregman majorization-minimization (MM) framework for nonconvex nonsmooth optimization. Iteration-dependent Bregman divergences are used to construct surrogate functions that majorize the objective. Convergence of the resulting algorithm iterates to critical points is proved under the Kurdyka-Łojasiewicz property, relaxing the standard global Lipschitz smoothness assumption on the objective. A constructive methodology is derived for building a broad class of variable Bregman majorants whose minimizers are tractable; this class is shown to include existing majorization techniques for Poisson data-fidelity terms. The framework is applied to PET image reconstruction with a nonconvex regularizer and validated numerically.
Significance. If the convergence analysis and majorant construction hold, the work is significant for optimization and imaging applications. It unifies and extends MM methods by allowing variable Bregman surrogates without Lipschitz smoothness, provides an explicit construction recipe that recovers Poisson-specific techniques, and supplies reproducible numerical evidence on a practical inverse problem. These elements address a recurring practical need for flexible, provably convergent surrogates in nonconvex nonsmooth settings.
major comments (2)
- [§3.3, Theorem 3.4] §3.3, Theorem 3.4: the sufficient-decrease inequality (3.12) is stated for the variable Bregman surrogate, yet the proof only verifies the majorization property at the current iterate; it is not shown that the chosen variable parameter sequence remains admissible for all subsequent iterates without an additional uniform bound on the Bregman modulus.
- [§4.1, Eq. (4.7)] §4.1, Eq. (4.7): the explicit form of the variable Bregman majorant for the Poisson negative-log-likelihood is derived by linearizing the Hessian at the current point, but the resulting surrogate is claimed to be a global majorant; the verification that the remainder term is nonnegative for all x appears to rely on a local convexity argument that may fail when the current iterate is far from the minimizer.
minor comments (3)
- [§2.2] Notation for the variable Bregman divergence D_{φ_k} is introduced without an explicit statement of the dependence of φ_k on the iteration index k; a short clarifying sentence in §2.2 would remove ambiguity.
- [Numerical experiments] Figure 2 caption does not specify the number of independent runs or the precise stopping criterion used for the PET experiments; adding these details would improve reproducibility.
- [Introduction] The introduction cites several classical MM papers but omits recent variable-metric or adaptive-Bregman works (e.g., on proximal gradient with variable metrics); a brief comparison paragraph would strengthen the novelty claim.
Simulated Author's Rebuttal
We thank the referee for the careful reading and constructive comments on our manuscript. We address each major comment below and indicate the revisions we will make.
read point-by-point responses
-
Referee: [§3.3, Theorem 3.4] §3.3, Theorem 3.4: the sufficient-decrease inequality (3.12) is stated for the variable Bregman surrogate, yet the proof only verifies the majorization property at the current iterate; it is not shown that the chosen variable parameter sequence remains admissible for all subsequent iterates without an additional uniform bound on the Bregman modulus.
Authors: We appreciate the referee's observation regarding the admissibility of the variable parameter sequence. In our framework the Bregman modulus is chosen at each iteration to ensure majorization holds at the current point. To guarantee that the sufficient-decrease property propagates along the entire sequence, we will add a mild standing assumption that the sequence of Bregman moduli remains uniformly bounded (which is satisfied on compact level sets under the KL property). We will revise the statement of Theorem 3.4 and its proof in §3.3 to explicitly verify that the chosen parameters remain admissible for all subsequent iterates under this bound. This is a partial revision. revision: partial
-
Referee: [§4.1, Eq. (4.7)] §4.1, Eq. (4.7): the explicit form of the variable Bregman majorant for the Poisson negative-log-likelihood is derived by linearizing the Hessian at the current point, but the resulting surrogate is claimed to be a global majorant; the verification that the remainder term is nonnegative for all x appears to rely on a local convexity argument that may fail when the current iterate is far from the minimizer.
Authors: We thank the referee for this remark. The Poisson negative-log-likelihood is strictly convex with positive-definite Hessian. The variable Bregman majorant is obtained by replacing the Hessian with its value at the current iterate inside the Bregman divergence; because the Poisson term is convex, the difference between the true function and this surrogate is nonnegative for all x in the positive orthant. We will insert a short lemma in §4.1 that proves the global majorization property directly from the convexity and the explicit integral form of the remainder, without relying on proximity to a minimizer. The revised manuscript will contain this verification. revision: yes
Circularity Check
No significant circularity; derivation relies on external KL property
full rationale
The paper's core convergence result for variable Bregman MM iterates to critical points is established under the standard external Kurdyka-Lojasiewicz property, without any reduction to internally fitted parameters, self-definitional surrogates, or load-bearing self-citations. The constructive methodology for building majorants (including for Poisson terms) is presented as independent of the target result, and the framework unifies existing techniques without renaming known patterns as new derivations. No equations or steps reduce by construction to the paper's own inputs.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption The objective function satisfies the Kurdyka-Lojasiewicz property
Forward citations
Cited by 3 Pith papers
-
Adaptive Regularization for Sparsity Control in Bregman-Based Optimizers
An adaptive regularization update for Bregman optimizers achieves target sparsity levels from 75% to 99% with faster early convergence and performance matching or exceeding oracle-tuned baselines.
-
Adaptive Regularization for Sparsity Control in Bregman-Based Optimizers
An adaptive lambda update based on current vs target sparsity enables reliable 75-99% sparsity in LinBreg and AdaBreg optimizers while matching or exceeding non-adaptive baseline performance on speaker verification tasks.
-
Adaptive Regularization for Sparsity Control in Bregman-Based Optimizers
Adaptive λ adjustment for target sparsity in LinBreg and AdaBreg, shown to work on speaker verification models with ECAPA-TDNN and ResNet34.
Reference graph
Works this paper leans on
-
[1]
H. Attouch and J. Bolte,On the convergence of the proximal algorithm for nonsmooth functions involving analytic features, Math. Program., 116 (2009), pp. 5–16
work page 2009
-
[2]
H. Attouch, J. Bolte, and B. F. Svaiter,Convergence of descent methods for semi-algebraic and tame problems: proximal algorithms, forward-backward splitting, and regularized Gauss-Seidel methods, Math. Program., Series A, 137 (2013), pp. 91–129
work page 2013
-
[3]
A. Banerjee, I. Dhillon, and J. Ghosh,Clustering with bregman divergences, Journal of Machine Learning Research, 6 (2004), pp. 1705–1749. 25 M.Adly, A.Chazottes, E.Chouzenoux, J.-C.Pesqet& F.Sureau
work page 2004
-
[4]
H. H. Bauschke, J. Bolte, and M. Teboulle,A descent lemma beyond Lipschitz gradient continuity: First-order methods revisited and applications, Math. Oper. Res., 42 (2016), pp. 330–348
work page 2016
-
[5]
H. H. Bauschke and J. M. Borwein,Legendre functions and the method of random Bregman projections, J. Convex Anal., 4 (1997), pp. 27–67
work page 1997
-
[6]
H. H. Bauschke and P. L. Combettes,Convex Analysis and Monotone Operator Theory in Hilbert Spaces, CMS Books in Mathematics / Ouvrages de math´ematiques de la SMC, Springer, Cham, Switzerland, second ed., 2017
work page 2017
-
[7]
A. Beck and M. Teboulle,Mirror descent and nonlinear projected subgradient methods for convex optimization, Op. Res. Lett., 31 (2003), pp. 167–175
work page 2003
-
[8]
M. Bertero, P. Boccacci, G. Desider`a, and G. Vicidomini,Image deblurring with Poisson data: from cells to galaxies, Inverse Problems, 25 (2009), p. 123006
work page 2009
-
[9]
M. Bertero, P. Boccacci, and V. Ruggiero,Inverse Imaging with Poisson Data, IOP Publishing, Bristol, UK, 2018
work page 2018
-
[10]
M. Bertero and C. De Mol,A simple convergence proof of the ML-EM algorithm in the presence of background emission, Ann. Univ. Ferrara, (2022), pp. 259–275
work page 2022
- [11]
- [12]
-
[13]
S. Bonettini, I. Loris, F. Porta, and M. Prato,Variable metric inexact line-search-based methods for nonsmooth optimization, SIAM J. Optim., 26 (2016), pp. 891–921
work page 2016
-
[14]
S. Bonettini, F. Porta, V. Ruggiero, and L. Zanni,Variable metric techniques for forward–backward methods in imaging, J. Comput. Appl. Math., 385 (2021), p. 113192
work page 2021
-
[15]
J. F. Bonnans, J.-C. Gilbert, C. Lemar´echal, and C. A. Sagastiz´abal,A family of variable metric proximal methods, Math. Program., 68 (1995), pp. 15–47
work page 1995
-
[16]
E. Chouzenoux, M. Corbineau, and J.-C. Pesqet,A proximal interior point algorithm with applications to image processing, J. Math. Imaging Vision, 62 (2020), pp. 919–940
work page 2020
-
[17]
E. Chouzenoux, A. Jezierska, J.-C. Pesqet, and H. Talbot,A majorize-minimize subspace approach for ℓ2 −ℓ 0 image regularization, SIAM J. Imaging Sci., 6 (2013), pp. 563–591
work page 2013
-
[18]
E. Chouzenoux, S. Martin, and J. Pesqet,A local MM subspace method for solving constrained variational problems in image recovery, J. Math. Imaging Vision, 65 (2023), pp. 253–276
work page 2023
-
[19]
E. Chouzenoux, S. Moussaoui, and J. Idier,Majorize-minimize linesearch for inversion methods involving barrier function optimization, Inverse Problems, 28 (2012)
work page 2012
-
[20]
E. Chouzenoux, S. Moussaoui, J. Idier, and F. Mariette,Efficient maximum entropy reconstruction of nuclear magnetic resonance T1-T2 spectra, IEEE Trans. Signal Process., 58 (2010), pp. 6040–6051
work page 2010
-
[21]
E. Chouzenoux and J.-C. Pesqet,Optimization Methods for Signal Processing, in Source Separation in Physical-Chemical Sensing, C.Jutten, L. Duarte, and S. Moussaoui, eds., no. Chapter 2, IEEE Press, Piscataway, NJ, USA, Oct. 2023
work page 2023
-
[22]
E. Chouzenoux, J.-C. Pesqet, and A. Repetti,A variable metric forward-backward algorithm for minimizing the sum of a differentiable function and a convex function, J. Optim. Theory Appl., 162 (2014), pp. 107–132
work page 2014
-
[23]
P. Combettes and Q. Nguyen,Solving composite monotone inclusions in reflexive Banach spaces by constructing best Bregman approximations from their Kuhn-Tucker set, J. Convex Anal., 23 (2016), pp. 481–510
work page 2016
-
[24]
P. L. Combettes and B. C. V ˜u,Variable metric forward–backward splitting with applications to monotone inclusions in duality, Optimization, 63 (2014), pp. 1289–1318
work page 2014
-
[25]
P. L. Combettes and V. R. W ajs,Signal recovery by proximal forward-backward splitting, Multiscale Model. Simul., 4 (2005), pp. 1168–1200. 26 V ariable Bregman MM for nonconvex nonsmooth optimization with applications to Poisson imaging
work page 2005
-
[26]
A. R. De Pierro,On the relation between the ISRA and the EM algorithm for positron emission tomography, IEEE Trans. Med. Imag., 12 (1993), pp. 328–333
work page 1993
-
[27]
A. P. Dempster, N. M. Laird, and D. B. Rubin,Maximum likelihood from incomplete data via the EM algorithm, J. R. Stat. Soc., B: Stat. Methodol., 39 (1977), pp. 1–22
work page 1977
-
[28]
P. Eggermont,Multiplicative iterative algorithms for convex programming, Linear Algebra Appl., 130 (1990), pp. 25–42
work page 1990
-
[29]
H. Erdogan and J. A. Fessler,Monotonic algorithms for transmission tomography, IEEE Trans. Med. Imag., 18 (1999), pp. 801–814
work page 1999
-
[30]
J. Fessler and H. Erdogan,A paraboloidal surrogates algorithm for convergent penalized-likelihood emission image reconstruction, in Proceedings of the IEEE Nuclear Science Symposium and Medical Imaging Conference Record (NSSMIC 1998), vol. 2, Toronto, Canada, 1998, pp. 1132–1135 vol.2
work page 1998
-
[31]
J. Fessler and A. Hero,Complete-data spaces and generalized EM algorithms, in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 1993), vol. 4, Minneapolis, MN, USA, 1993, pp. 1–4
work page 1993
-
[32]
J. Fessler and A. Hero,Penalized maximum-likelihood image reconstruction using space-alternating general- ized EM algorithms, IEEE Trans. Image Process., 4 (1995), pp. 1417–1429
work page 1995
-
[33]
C. F ´evotte and J. Idier,Algorithms for nonnegative matrix factorization with the 𝛽-divergence, Neural Comput., 23 (2011), p. 2421–2456
work page 2011
-
[34]
P. Frankel, G. Garrigos, and J. Peypouqet,Splitting methods with variable metric for Kurdyka– Łojasiewicz functions and general convergence rates, J. Optim. Theory Appl., 165 (2015), pp. 874–900
work page 2015
-
[35]
C. F´evotte, N. Bertin, and J.-L. Durrieu,Nonnegative matrix factorization with the Itakura-Saito divergence: With application to music analysis, Neural Comput., 21 (2009), pp. 793–830
work page 2009
-
[36]
K. Gong, J. Guan, K. Kim, X. Zhang, J. Yang, Y. Seo, G. El Fakhri, J. Qi, and Q. Li,Iterative PET image reconstruction using convolutional neural network representation, IEEE Trans. Med. Imag., 38 (2019), pp. 675– 685
work page 2019
-
[37]
L. T. K. Hien, D. N. Phan, and N. Gillis,An inertial block majorization minimization framework for nonsmooth nonconvex optimization, J. Mach. Learn. Res., 24 (2023). [38]D. Hunter and K. Lange,A tutorial on MM algorithms, Amer. Statist., 58 (2004), pp. 30–37
work page 2023
-
[38]
S. Hurault, U. Kamilov, A. Leclaire, and N. Papadakis,Convergent Bregman plug-and-play image restora- tion for Poisson inverse problems, in Proceedings of the 37th Conference on Neural Information Processing Systems (NeurIPS 2023), vol. 36, New Orleans, LA, USA, 2023, pp. 27251–27280
work page 2023
-
[39]
M. W. Jacobson and J. A. Fessler,An expanded theoretical treatment of iteration-dependent majorize-minimize algorithms, IEEE Trans. Imag. Proc., 16 (2007), p. 2411–2422
work page 2007
- [40]
-
[41]
H. Lant´eri, M. Roche, and C. Aime,Penalized maximum likelihood image restoration with positivity con- straints: multiplicative algorithms, Inverse Problems, 18 (2002), p. 1397
work page 2002
-
[42]
D. D. Lee and H. S. Seung,Learning the parts of objects by nonnegative matrix factorization, Nature, 401 (1999), pp. 788–791
work page 1999
-
[43]
B. Li, S. Pan, and T. Zeng,An inexact proximal majorization–minimization method for a class of image reconstruction models, Inverse Problems, 41 (2025), p. 065004
work page 2025
-
[44]
Lucy,An iterative technique for the rectification of observed distributions, Astron
L. Lucy,An iterative technique for the rectification of observed distributions, Astron. J., 79 (1974), pp. 745–754
work page 1974
-
[45]
S. Martin, J.-C. Pesqet, G. Steidl, and I. B. Ayed,Variable Bregman majorization-minimization algorithm and its application to Dirichlet maximum likelihood estimation, J. Appl. Numer. Optim., 7 (2025)
work page 2025
-
[46]
A. Mehranian and A. J. Reader,Model-based deep learning PET image reconstruction using forward–backward splitting expectation–maximization, IEEE Trans. Radiat. Plasma Med. Sci., 5 (2021), pp. 54–64. 27 M.Adly, A.Chazottes, E.Chouzenoux, J.-C.Pesqet& F.Sureau
work page 2021
-
[47]
M. C. Mukkamala,Bregman Proximal Minimization Algorithms, Analysis and Applications, PhD thesis, Eberhard Karls Universit¨at T¨ubingen, 2021
work page 2021
-
[48]
S. Naderi, K. He, R. Aghajani, S. Sclaroff, and P. Felzenszwalb,Generalized majorization-minimization, in Proceedings of the 36th International Conference on Machine Learning (ICML 2019), Long Beach, California, 2019. [50]F. Natterer and F. Wubbeling,Mathematical Methods in Image Reconstruction, SIAM, Philadelphia, 2001
work page 2019
-
[49]
I. Necoara and D. Lupu,A systematic approach to general higher-order majorization-minimization algorithms for (non)convex optimization, SIAM J. on Optimization, 35 (2025), p. 2372–2401
work page 2025
-
[50]
A. Repetti and Y. Wiaux,Variable metric forward-backward algorithm for composite minimization problems, SIAM J. on Optimization, 31 (2021), pp. 1215–1241
work page 2021
-
[51]
Richardson,Bayesian-based iterative method of image restoration, J
W. Richardson,Bayesian-based iterative method of image restoration, J. Opt. Soc. Am., 62 (1972), pp. 55–59. [54]R. T. Rockafellar,Convex Analysis, Princeton University Press, Princeton, NJ, 1970. [55]R. T. Rockafellar and R. J.-B. Wets,Variational Analysis, vol. 317, Springer, Berlin, 1998
work page 1972
-
[52]
C. Rossignol, F. Sureau, E. Chouzenoux, C. Comtat, and J.-C. Pesqet,A Bregman majorization- minimization framework for PET image reconstruction, in Proceedings of the IEEE International Conference on Image Processing (ICIP 2022), Bordeaux, France, Oct. 2022, IEEE, pp. 1736–1740
work page 2022
-
[53]
L. Shepp and Y. V ardi,Maximum likelihood reconstruction for emission tomography, IEEE Trans. Med. Imag., 1 (1982), pp. 113–122
work page 1982
-
[54]
S. Stute, C. Tauber, C. Leroy, M. Bottlaender, V. Brulon, and C. Comtat,Analytical simulations of dynamic PET scans with realistic count rates properties, in Proceedings of the IEEE Nuclear Science Symposium and Medical Imaging Conference (NSS/MIC 2015), San Diego, CA, USA, 2015, pp. 1–3
work page 2015
-
[55]
Y. Sun, P. Babu, and D. P. Palomar,Majorization-minimization algorithms in signal processing, communica- tions, and machine learning, IEEE Trans. Sig. Process., 65 (2017), pp. 794–816. [60]M. Teboulle,A simplified view of first order methods for optimization, Math. Program., Series B, 170 (2018), pp. 67–96. [61]G. W ang and J. Qi,Generalized algorithms fo...
work page 2017
-
[56]
G. W ang and J. Qi,Penalized likelihood PET image reconstruction using patch-based edge-preserving regular- ization, IEEE Trans. Med. Imag., 31 (2012), pp. 2194–2204
work page 2012
-
[57]
R. Zanella, P. Boccacci, L. Zanni, and M. Bertero,Efficient gradient projection methods for edge-preserving removal of Poisson noise, Inverse Problems, 25 (2009), p. 045010
work page 2009
- [58]
-
[59]
G. Zhou, D. Tward, and K. Lange,A majorization-minimization algorithm for neuroimage registration, SIAM J. Imaging Sci., 17 (2024), pp. 273–300. 28 V ariable Bregman MM for nonconvex nonsmooth optimization with applications to Poisson imaging A Proof of proposition 5.3 LetD=(0,+∞) 𝑁 and𝑧∈ D. (Maj-5)For every𝑥∈ D, (𝑥−𝑧) ⊤𝐶ℎ1,𝑧 (𝑥−𝑧) ≤ 𝑁∑︁ 𝑛=1 𝑎1,𝑛(𝑧) ∫ 1 0...
work page 2024
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.