A DC Composite Optimization via Variable Smoothing for Robust Phase Retrieval with Nonconvex Loss Functions
Pith reviewed 2026-05-10 18:21 UTC · model grok-4.3
The pith
A variable smoothing algorithm minimizes DC composite objectives for robust phase retrieval using general nonconvex loss functions.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The authors propose viewing the robust phase retrieval objective with a DC loss as a DC-composite function and develop a variable smoothing algorithm that generates smooth surrogates using Moreau envelopes of the convex and concave parts, then applies gradient descent without any inner loop. They prove that the sequence of iterates converges to a DC composite critical point. Experiments confirm that DC losses provide greater robustness to outliers than the ℓ1 loss.
What carries the argument
The variable smoothing algorithm that builds differentiable surrogates from Moreau envelopes of each weakly convex part in the DC loss, then takes a gradient step on the resulting smooth composite function.
If this is right
- The algorithm converges to a critical point without needing nested optimization loops.
- DC loss functions can replace ℓ1 for improved outlier handling in phase retrieval.
- The method applies to any DC-composite optimization problem with a smooth mapping.
- Absence of an inner loop reduces per-iteration cost for large-scale quadratic measurement problems.
Where Pith is reading between the lines
- The same smoothing construction could apply directly to other nonconvex composite problems such as robust matrix completion or blind deconvolution.
- One could test whether particular choices of DC loss, such as those based on Huber or Tukey functions, give still better recovery on data with structured outliers.
- The convergence result might extend to stochastic or online settings if the Moreau envelopes are replaced by suitable stochastic approximations.
Load-bearing premise
The Moreau-envelope surrogates stay close enough to the original DC objective that gradient steps on them reliably approach a useful critical point for the phase retrieval task.
What would settle it
An experiment on synthetic phase retrieval data with known outliers where the DC method fails to recover the signal more accurately than ℓ1 methods, or a case where the generated sequence does not approach a DC composite critical point.
Figures
read the original abstract
In this paper, we propose an optimization-based method for robust phase retrieval problem where the goal is to estimate an unknown signal from a quadratic measurement corrupted by outliers. To enhance the robustness of existing optimization models with the $\ell_1$ loss function, we propose a generalized model that can handle DC (Difference-of-Convex) loss functions beyond the $\ell_1$ loss. We view the cost function of the proposed model as a composition of a DC function with a smooth mapping, and develop a variable smoothing algorithm for minimizing such DC composite functions. At each step of our algorithm, we generate a smooth surrogate function by using the Moreau envelope of each (weakly) convex function in the DC function, and then perform the gradient descent update of the surrogate function. Unlike many existing algorithms for DC problems, the proposed algorithm does not require any inner loop. We also present a convergence analysis in terms of a DC composite critical point for the proposed algorithm. Our numerical experiment demonstrates that the proposed method with DC loss functions is more robust against outliers compared to existing methods with the $\ell_1$ loss.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes a variable-smoothing algorithm for DC-composite optimization in robust phase retrieval, where each convex piece of a DC loss is replaced by its Moreau envelope to form a smooth surrogate; a single gradient step is taken on the surrogate at each iteration without inner loops. Convergence is proved to a DC-composite critical point of the surrogate sequence, and numerical experiments are presented claiming that DC losses yield greater outlier robustness than standard ℓ1-loss formulations.
Significance. If the gap between surrogate and original critical points can be closed with a quantitative error bound, the inner-loop-free variable-smoothing scheme would be a useful algorithmic contribution for nonconvex robust estimation problems in signal processing. The avoidance of inner loops is a concrete practical advantage over many existing DC methods, and the extension beyond ℓ1 losses is a natural direction for robustness.
major comments (2)
- [§4 (Convergence Analysis), Theorem 4.1] §4 (Convergence Analysis), Theorem 4.1: the stated convergence is only to a DC-composite critical point of the sequence of Moreau-smoothed surrogates; no quantitative bound is supplied relating the envelope approximation error (e.g., distance between the surrogate subdifferential and the original DC subdifferential) to the smoothing schedule μ_k or to the distance of the limit point to the true critical set of the unsmoothed DC composite objective. This is load-bearing for both the theoretical claim and the robustness assertion, because the phase-retrieval mapping is nonconvex and outliers can make the surrogate and original critical sets materially different.
- [§5 (Numerical Experiments)] §5 (Numerical Experiments): the claim that DC-loss versions are “more robust against outliers” is supported only by recovery-error tables; the manuscript does not specify the exact DC loss functions used, the precise outlier model (fraction, magnitude distribution), the number of Monte-Carlo trials, or statistical significance tests, so it is impossible to determine whether the reported improvement is attributable to the DC formulation or to implementation details.
minor comments (2)
- [§3 (Algorithm)] The variable smoothing schedule (how μ_k is chosen or decreased) is described only qualitatively; an explicit formula or pseudocode line would improve reproducibility.
- [§2] Notation for the DC decomposition (f = g − h) and the composite mapping could be aligned more clearly with the phase-retrieval quadratic measurements introduced in §2.
Simulated Author's Rebuttal
We thank the referee for the constructive comments. We address each major point below and indicate planned revisions.
read point-by-point responses
-
Referee: §4 (Convergence Analysis), Theorem 4.1: the stated convergence is only to a DC-composite critical point of the sequence of Moreau-smoothed surrogates; no quantitative bound is supplied relating the envelope approximation error (e.g., distance between the surrogate subdifferential and the original DC subdifferential) to the smoothing schedule μ_k or to the distance of the limit point to the true critical set of the unsmoothed DC composite objective. This is load-bearing for both the theoretical claim and the robustness assertion, because the phase-retrieval mapping is nonconvex and outliers can make the surrogate and original critical sets materially different.
Authors: We acknowledge that Theorem 4.1 establishes convergence of the iterates to a DC-composite critical point of the smoothed surrogate sequence rather than directly to the original unsmoothed DC objective. The variable-smoothing framework with μ_k → 0 is constructed precisely to approximate the target problem, yet we agree that an explicit quantitative error bound relating surrogate subdifferentials to the original DC subdifferential would strengthen the result. Deriving such a bound in full generality appears to require additional assumptions on the DC loss and the quadratic measurement map that are not imposed in the current analysis. In the revision we will add a clarifying remark in Section 4 that discusses the role of the smoothing schedule and the sense in which the surrogate critical points approximate those of the original problem. The inner-loop-free convergence guarantee for the surrogates remains a concrete algorithmic contribution. revision: partial
-
Referee: §5 (Numerical Experiments): the claim that DC-loss versions are “more robust against outliers” is supported only by recovery-error tables; the manuscript does not specify the exact DC loss functions used, the precise outlier model (fraction, magnitude distribution), the number of Monte-Carlo trials, or statistical significance tests, so it is impossible to determine whether the reported improvement is attributable to the DC formulation or to implementation details.
Authors: The referee correctly identifies that the experimental section omits several necessary details. In the revised manuscript we will expand Section 5 to state the precise DC loss functions employed (including their explicit mathematical expressions), the outlier model (fraction of corrupted measurements and the distribution from which outlier magnitudes are drawn), the exact number of Monte-Carlo trials performed for each setting, and the results of statistical significance tests (e.g., paired t-tests or Wilcoxon rank-sum tests) comparing the DC-loss and ℓ1-loss recoveries. These additions will make the robustness comparison reproducible and will clarify that the observed gains are attributable to the choice of loss. revision: yes
- A quantitative error bound relating the distance between surrogate critical points and the original DC critical set without imposing further restrictive assumptions on the DC functions or the measurement model.
Circularity Check
No circularity: algorithmic derivation is self-contained
full rationale
The paper develops a variable-smoothing algorithm for DC-composite phase-retrieval objectives by replacing each convex piece with its Moreau envelope and performing gradient steps on the resulting smooth surrogate; convergence is established to a DC-composite critical point of the surrogate sequence. All steps rely on standard convex-analysis identities (Moreau envelope properties, subdifferential calculus) that are independent of the target result and are not obtained by fitting parameters to the paper's own data or by self-referential definitions. No load-bearing claim reduces by the paper's equations to a fitted input renamed as a prediction, nor does any uniqueness or ansatz rest on a self-citation chain. The robustness comparison is supported by numerical experiments rather than by construction.
Axiom & Free-Parameter Ledger
axioms (1)
- standard math Moreau envelope of a weakly convex function yields a smooth surrogate whose gradient is well-defined
Reference graph
Works this paper leans on
-
[1]
Variable smoothing algorithm for inner-loop-free DC composite optimizations,
K. Yazawa, K. Kume, and I. Yamada, “Variable smoothing algorithm for inner-loop-free DC composite optimizations,” in EUSIPCO, 2025
work page 2025
-
[2]
Robust phaselift for phase retrieval under corruptions,
P. Hand and T. Huynh, “Robust phaselift for phase retrieval under corruptions,” in 50th Asilomar Conference on Signals, Systems and Computers. IEEE, 2016, pp. 1039–1042
work page 2016
-
[3]
Corruption robust phase retrieval via linear programming,
P. Hand and V . V oroninski, “Corruption robust phase retrieval via linear programming,” arXiv (1612.03547v1), 2016
-
[4]
Median-truncated nonconvex approach for phase retrieval with outliers,
H. Zhang, Y . Chi, and Y . Liang, “Median-truncated nonconvex approach for phase retrieval with outliers,” IEEE Transactions on Information Theory, vol. 64, no. 11, pp. 7287–7310, 2018
work page 2018
-
[5]
Z. Zheng, S. Ma, and L. Xue, “A new inexact proximal linear algo- rithm with adaptive stopping criteria for robust phase retrieval,” IEEE Transactions on Signal Processing , vol. 72, pp. 1081–1093, 2024
work page 2024
-
[6]
Robust phase retrieval by alternating minimization,
S. Kim and K. Lee, “Robust phase retrieval by alternating minimization,” IEEE Transactions on Signal Processing , vol. 73, pp. 40–54, 2025
work page 2025
-
[7]
Phase retrieval in crystallography and optics,
R. P. Millane, “Phase retrieval in crystallography and optics,” Journal of the Optical Society of America A , vol. 7, no. 3, pp. 394–411, 1990
work page 1990
-
[8]
The phase problem of X-ray crystallography,
H. A. Hauptman, “The phase problem of X-ray crystallography,” Reports on Progress in Physics , vol. 54, no. 11, pp. 1427–1454, 1991
work page 1991
-
[9]
The question of phase retrieval in optics,
A. Walther, “The question of phase retrieval in optics,” Optica Acta: International Journal of Optics , vol. 10, no. 1, pp. 41–49, 1963
work page 1963
-
[10]
Phase retrieval with application to optical imaging: a contemporary overview,
Y . Shechtman, Y . C. Eldar, O. Cohen, H. N. Chapman, J. Miao, and M. Segev, “Phase retrieval with application to optical imaging: a contemporary overview,” IEEE Signal Processing Magazine , vol. 32, no. 3, pp. 87–109, 2015
work page 2015
-
[11]
Phase retrieval and image reconstruction for astronomy,
C. Fienup and J. Dainty, “Phase retrieval and image reconstruction for astronomy,” Image Recovery: Theory and Application , vol. 231, p. 275, 1987
work page 1987
-
[12]
Undersampled phase retrieval with outliers,
D. S. Weller, A. Pnueli, G. Divon, O. Radzyner, Y . C. Eldar, and J. A. Fessler, “Undersampled phase retrieval with outliers,”IEEE Transactions on Computational Imaging , vol. 1, no. 4, pp. 247–258, 2015
work page 2015
-
[13]
A practical algorithm for the determination of phase from image and diffraction plane picture,
R. Gerhberg and W. Saxton, “A practical algorithm for the determination of phase from image and diffraction plane picture,” Optik, vol. 35, pp. 237–246, 1972
work page 1972
-
[14]
Reconstruction of an object from the modulus of its Fourier transform,
J. R. Fienup, “Reconstruction of an object from the modulus of its Fourier transform,” Optics Letters, vol. 3, no. 1, pp. 27–29, 1978
work page 1978
-
[15]
Phase retrieval algorithms: a comparison,
——, “Phase retrieval algorithms: a comparison,” Applied Optics , vol. 21, no. 15, pp. 2758–2769, 1982
work page 1982
-
[16]
Phase retrieval via matrix completion,
E. J. Candes, Y . C. Eldar, T. Strohmer, and V . V oroninski, “Phase retrieval via matrix completion,” SIAM review, vol. 57, no. 2, pp. 225– 251, 2015
work page 2015
-
[17]
Phase retrieval via Wirtinger flow: Theory and algorithms,
E. J. Candes, X. Li, and M. Soltanolkotabi, “Phase retrieval via Wirtinger flow: Theory and algorithms,” IEEE Transactions on Information The- ory, vol. 61, no. 4, pp. 1985–2007, 2015
work page 1985
-
[18]
Solving (most) of a set of quadratic equalities: Composite optimization for robust phase retrieval,
J. C. Duchi and F. Ruan, “Solving (most) of a set of quadratic equalities: Composite optimization for robust phase retrieval,” Information and Inference: A Journal of the IMA , vol. 8, no. 3, pp. 471–529, 2019
work page 2019
-
[19]
P. Bloomfield and W. L. Steiger, Least absolute deviations: theory, applications, and algorithms . Springer, 1983, vol. 6
work page 1983
-
[20]
M. Yukawa, H. Kaneko, K. Suzuki, and I. Yamada, “Linearly-involved Moreau-enhanced-over-subspace model: Debiased sparse modeling and stable outlier-robust regression,” IEEE Transactions on Signal Process- ing, vol. 71, pp. 1232–1247, 2023
work page 2023
-
[21]
Scalable robust matrix factorization with non- convex loss,
Q. Yao and J. Kwok, “Scalable robust matrix factorization with non- convex loss,” in Advances in Neural Information Processing Systems , vol. 31, 2018
work page 2018
-
[22]
Robust low-rank tensor recovery with regularized redescending M-estimator,
Y . Yang, Y . Feng, and J. A. Suykens, “Robust low-rank tensor recovery with regularized redescending M-estimator,” IEEE Transactions on 9 n=d5 10 15 20 pfail 0.1 0.2 0.3 0.4 0.5 0.6 0 0.2 0.4 0.6 0.8 1 (a) ℓ1 by IPL n=d5 10 15 20 pfail 0.1 0.2 0.3 0.4 0.5 0.6 0 0.2 0.4 0.6 0.8 1 (b) Capped ℓ1 (β = 100) n=d5 10 15 20 pfail 0.1 0.2 0.3 0.4 0.5 0.6 0 0.2 0....
work page 1933
-
[23]
Nearly unbiased variable selection under minimax con- cave penalty,
C.-H. Zhang, “Nearly unbiased variable selection under minimax con- cave penalty,” The Annals of Statistics , pp. 894–942, 2010
work page 2010
-
[24]
Multi-stage convex relaxation for learning with sparse regularization,
T. Zhang, “Multi-stage convex relaxation for learning with sparse regularization,” in Advances in Neural Information Processing Systems , vol. 21, 2008
work page 2008
-
[25]
Rank-based estimates in the linear model with high break- down point,
O. H ¨ossjer, “Rank-based estimates in the linear model with high break- down point,” Journal of the American Statistical Association , vol. 89, no. 425, pp. 149–158, 1994
work page 1994
-
[26]
Robust principal component analysis via capped norms,
Q. Sun, S. Xiang, and J. Ye, “Robust principal component analysis via capped norms,” in Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining , 2013
work page 2013
-
[27]
CVaR (superquantile) norm: Stochastic case,
A. Mafusalov and S. Uryasev, “CVaR (superquantile) norm: Stochastic case,” European Journal of Operational Research , vol. 249, no. 1, pp. 200–208, 2016
work page 2016
-
[28]
DC formulations and algorithms for sparse optimization problems,
J.-y. Gotoh, A. Takeda, and K. Tono, “DC formulations and algorithms for sparse optimization problems,” Mathematical Programming , vol. 169, no. 1, pp. 141–176, 2018
work page 2018
-
[29]
Exact penalization at d-stationary points of cardinality-or rank-constrained problem,
S. Yagishita and J.-y. Gotoh, “Exact penalization at d-stationary points of cardinality-or rank-constrained problem,” Optimization, pp. 1–35, 2025
work page 2025
-
[30]
Applications and algorithms for least trimmed sum of absolute deviations regression,
D. M. Hawkins and D. Olive, “Applications and algorithms for least trimmed sum of absolute deviations regression,” Computational Statis- tics & Data Analysis , vol. 32, no. 2, pp. 119–134, 1999
work page 1999
-
[31]
K. Kume and I. Yamada, “Minimization of nonsmooth weakly con- vex function over prox-regular set for robust low-rank matrix recov- ery,” arXiv (2509.17549v2) , 2025, (Accepted for presentation at IEEE ICASSP 2026)
-
[32]
Beck, First-order methods in optimization
A. Beck, First-order methods in optimization . SIAM, 2017
work page 2017
-
[33]
The proximity operator repository,
G. Chierchia, E. Chouzenoux, P. L. Combettes, and J.-C. Pesquet, “The proximity operator repository,” http://proximity-operator.net/
-
[34]
SLOPE—adaptive variable selection via convex optimization,
M. Bogdan, E. Van Den Berg, C. Sabatti, W. Su, and E. J. Cand `es, “SLOPE—adaptive variable selection via convex optimization,” The Annals of Applied Statistics , vol. 9, no. 3, p. 1103, 2015
work page 2015
-
[35]
Cauchy loss function: Robustness under Gaussian and Cauchy noise,
T. Mlotshwa, H. van Deventer, and A. S. Bosman, “Cauchy loss function: Robustness under Gaussian and Cauchy noise,” in Southern African Conference for Artificial Intelligence Research . Springer, 2022, pp. 123–138
work page 2022
-
[36]
Enhancing sparsity by reweighted ℓ1 minimization,
E. J. Candes, M. B. Wakin, and S. P. Boyd, “Enhancing sparsity by reweighted ℓ1 minimization,” Journal of Fourier Analysis and Applica- tions, vol. 14, no. 5, pp. 877–905, 2008
work page 2008
-
[37]
A nonconvex model with minimax concave penalty for image restoration,
J. You, Y . Jiao, X. Lu, and T. Zeng, “A nonconvex model with minimax concave penalty for image restoration,” Journal of Scientific Computing, vol. 78, no. 2, pp. 1063–1086, 2019
work page 2019
-
[38]
Two-level ℓ1 minimization for compressed sensing,
X. Huang, Y . Liu, L. Shi, S. Van Huffel, and J. A. Suykens, “Two-level ℓ1 minimization for compressed sensing,” Signal Processing, vol. 108, pp. 459–475, 2015
work page 2015
-
[39]
Minimizing compositions of differences-of-convex functions with smooth mappings,
H. A. Le Thi, V . N. Huynh, and T. P. Dinh, “Minimizing compositions of differences-of-convex functions with smooth mappings,”Mathematics of Operations Research , vol. 49, no. 2, pp. 1140–1168, 2024
work page 2024
-
[40]
Variable smoothing for weakly convex composite functions,
A. B ¨ohm and S. J. Wright, “Variable smoothing for weakly convex composite functions,” Journal of Optimization Theory and Applications , vol. 188, pp. 628–649, 2021
work page 2021
-
[41]
K. Kume and I. Yamada, “A variable smoothing for nonconvexly 10 constrained nonsmooth optimization with application to sparse spectral clustering,” in IEEE ICASSP, 2024
work page 2024
-
[42]
Algorithms for difference-of-convex programs based on difference-of-Moreau-envelopes smoothing,
K. Sun and X. A. Sun, “Algorithms for difference-of-convex programs based on difference-of-Moreau-envelopes smoothing,” INFORMS Jour- nal on Optimization , vol. 5, no. 4, pp. 321–339, 2023
work page 2023
-
[43]
R. T. Rockafellar and R. J.-B. Wets, Variational analysis. Springer Science & Business Media, 2009, vol. 317
work page 2009
-
[44]
J. Li, A. M.-C. So, and W.-K. Ma, “Understanding notions of stationarity in nonsmooth optimization: A guided tour of various constructions of subdifferential for nonsmooth functions,” IEEE Signal Processing Magazine, vol. 37, no. 5, pp. 18–31, 2020
work page 2020
-
[45]
K. Kume and I. Yamada, “A variable smoothing for weakly convex composite minimization with manifold constraint via parametrization,” arXiv (2412.04225v4), 2026
-
[46]
Open issues and recent advances in DC programming and DCA,
H. A. Le Thi and T. Pham Dinh, “Open issues and recent advances in DC programming and DCA,” Journal of Global Optimization , vol. 88, no. 3, pp. 533–590, 2024
work page 2024
-
[47]
An inexact proximal linearized DC algorithm with provably terminating inner loop,
Y . Zhang and I. Yamada, “An inexact proximal linearized DC algorithm with provably terminating inner loop,” Optimization, pp. 1–33, 2024
work page 2024
-
[48]
A regularization interpreta- tion of the proximal point method for weakly convex functions,
T. Hoheisel, M. Laborde, and A. Oberman, “A regularization interpreta- tion of the proximal point method for weakly convex functions,” Journal of Dynamics & Games , vol. 7, no. 1, pp. 79–96, 2020
work page 2020
-
[49]
Andrei, Nonlinear conjugate gradient methods for unconstrained optimization
N. Andrei, Nonlinear conjugate gradient methods for unconstrained optimization. Springer, 2020
work page 2020
-
[50]
T. M. Apostol, Mathematical analysis second edition. Addison-Wesley, 1974
work page 1974
-
[51]
K. Kume and I. Yamada, “A proximal variable smoothing for minimiza- tion of nonlinearly composite nonsmooth function–maxmin dispersion and mimo applications,” arXiv (2506.05974v1), 2025
-
[52]
Efficiency of minimizing composi- tions of convex functions and smooth maps,
D. Drusvyatskiy and C. Paquette, “Efficiency of minimizing composi- tions of convex functions and smooth maps,” Mathematical Program- ming, vol. 178, no. 1, pp. 503–558, 2019
work page 2019
-
[53]
J. A. Dieudonn ´e, Foundations of modern analysis . Academic Press, 1960. APPENDIX A PROOF OF LEMMA II.4 We show that a local minimizer x⋆ of F is a DC composite critical point of F . Proof of Lemma II.4. It suffices to prove the Claim 1: ∂L(g ◦ S)(x⋆) ̸= ∅ and the Claim 2: ∂L(f ◦S)(x⋆) ⊃ ∂L(g ◦S)(x⋆) because these two claims imply that ∂L(f ◦ S)(x⋆) − ∂L...
work page 1960
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.