Algorithmic Analysis and Statistical Estimation of SLOPE via Approximate Message Passing

Cynthia Rush; Jason Klusowski; Weijie Su; Zhiqi Bu

arxiv: 1907.07502 · v1 · pith:EGCE7R2Fnew · submitted 2019-07-17 · 📊 stat.ML · cs.LG· eess.SP· math.ST· stat.TH

Algorithmic Analysis and Statistical Estimation of SLOPE via Approximate Message Passing

Zhiqi Bu , Jason Klusowski , Cynthia Rush , Weijie Su This is my paper

Pith reviewed 2026-05-24 20:05 UTC · model grok-4.3

classification 📊 stat.ML cs.LGeess.SPmath.STstat.TH

keywords SLOPEapproximate message passingsorted l1 penaltyhigh-dimensional regressionstate evolutionasymptotic analysisconvex optimization

0 comments

The pith

Approximate message passing yields an asymptotically exact characterization of the SLOPE solution under Gaussian random designs.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

SLOPE applies a sorted l1 penalty whose size depends on coefficient rank, which defeats many standard tools for analyzing high-dimensional estimators. The authors solve the SLOPE problem with approximate message passing and invoke a state evolution theory that handles non-separable penalties to track the iterates exactly in the large-system limit. They prove the iterates converge to the true SLOPE solution and show numerically that this happens quickly. The result supplies both an efficient approximation method and sharp asymptotic formulas for the estimator's behavior.

Core claim

By applying approximate message passing to the SLOPE objective and using state evolution for non-separable penalties, the AMP iterates converge asymptotically to the SLOPE solution under Gaussian designs, thereby delivering an exact large-system characterization of the estimator along with a practical algorithmic surrogate.

What carries the argument

Approximate message passing iterates whose dynamics are governed by state evolution analysis applied to the sorted l1 penalty.

If this is right

The asymptotic risk and other performance metrics of SLOPE follow from the fixed point of the state evolution recursion.
The SLOPE solution can be approximated by running a modest number of cheap AMP iterations instead of solving the full convex program.
The method supplies both statistical analysis and a constructive algorithm for the same estimator.
Numerical evidence indicates that only a few AMP iterations already produce close agreement with the exact SLOPE solution.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same AMP-plus-state-evolution route may yield characterizations for other regression estimators that use non-separable or rank-dependent penalties.
One could check whether the observed fast convergence persists when the design matrix deviates from Gaussian entries.
The algorithmic equivalence suggests that message-passing methods could serve as a general device for deriving exact asymptotics in high-dimensional convex problems.
Extensions to generalized linear models or other loss functions would test how far the non-separable state evolution technique reaches.

Load-bearing premise

The state evolution analysis for non-separable penalties applies directly to the sorted l1 penalty, and the SLOPE-specific argument establishes that the AMP iterates match the SLOPE solution in the large limit.

What would settle it

A calculation or simulation in which the AMP iterates remain bounded away from the true SLOPE solution as dimensions grow large under Gaussian designs would refute the claimed asymptotic convergence.

Figures

Figures reproduced from arXiv: 1907.07502 by Cynthia Rush, Jason Klusowski, Weijie Su, Zhiqi Bu.

**Figure 1.** Figure 1: Optimization errors, ||β t − βb||2/p, and (symmetric) set difference of supp(β t ) and supp(βb). Optimization errors Set Diff 10−2 10−3 10−4 10−5 10−6 ISTA 60 4048 7326 8569 9007 9161 FISTA 47 275 374 412 593 604 AMP 30 6 13 22 32 40 [PITH_FULL_IMAGE:figures/full_fig_p003_1.png] view at source ↗

**Figure 2.** Figure 2: Amin (black curve) when p = 2 and δ = 0.6. and in Algorithm 1 we show that determining the calibration is straightforward in practice. Proposition 2.6. The function α 7→ λ(α) defined in (2.10) is continuous on {α : f(α) < δ} for f(·) defined in (2.7) with λ(Amin) = −∞ and limα→∞ λ(α) = ∞ (where the limit is taken elementwise). Therefore the function λ 7→ α(λ) satisfying (2.10) exists. As p → ∞, the functio… view at source ↗

**Figure 3.** Figure 3: The blue area contained by the black line segment is the set of subgradients; Red crosses [PITH_FULL_IMAGE:figures/full_fig_p018_3.png] view at source ↗

**Figure 4.** Figure 4: Left: c2 = 0.5; Right: c2 = 0.2; Blue area is {ν ∈ ∂Jλ(0, 0) : |ν| (1 − c2)P(λ1, λ2)} and grey area is complement of blue area in ∂Jλ(0, 0). We know from the proof of Lemma 7.2 Eq. (7.4) that ν t = µt(X>z t−1 + β t−1 − β t ) ∈ µtJθt (β t ) where µt := hλ, θt−1i/kθt−1k 2 and λ = µtθ t−1 . Therefore, summing over all equivalence classes I, |st(c2)| = X I I{|ν t I | P([Πˆ −1 βt (λ)]I )(1 − c2)} = X I I n … view at source ↗

read the original abstract

SLOPE is a relatively new convex optimization procedure for high-dimensional linear regression via the sorted l1 penalty: the larger the rank of the fitted coefficient, the larger the penalty. This non-separable penalty renders many existing techniques invalid or inconclusive in analyzing the SLOPE solution. In this paper, we develop an asymptotically exact characterization of the SLOPE solution under Gaussian random designs through solving the SLOPE problem using approximate message passing (AMP). This algorithmic approach allows us to approximate the SLOPE solution via the much more amenable AMP iterates. Explicitly, we characterize the asymptotic dynamics of the AMP iterates relying on a recently developed state evolution analysis for non-separable penalties, thereby overcoming the difficulty caused by the sorted l1 penalty. Moreover, we prove that the AMP iterates converge to the SLOPE solution in an asymptotic sense, and numerical simulations show that the convergence is surprisingly fast. Our proof rests on a novel technique that specifically leverages the SLOPE problem. In contrast to prior literature, our work not only yields an asymptotically sharp analysis but also offers an algorithmic, flexible, and constructive approach to understanding the SLOPE problem.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper gives the first AMP-based asymptotic characterization of SLOPE by adapting state evolution to the sorted l1 penalty and adding a custom convergence argument.

read the letter

The core advance is an AMP analysis that works for the non-separable sorted l1 penalty in SLOPE. Earlier AMP results were limited to separable penalties, so this extends the framework to a case that had been out of reach. They characterize the asymptotic behavior of the iterates via the recent non-separable state evolution and then prove that those iterates converge to the actual SLOPE solution under Gaussian designs, using a proof step that exploits the structure of the SLOPE problem itself. Simulations are reported to show the approximation is accurate and converges quickly, which supports the practical side of the claim. The approach is constructive: it turns the hard optimization into a sequence of easier AMP steps that can be tracked explicitly. The citation pattern is straightforward and builds on the prior state-evolution literature without circularity. The main point to check is whether the general regularity conditions for the non-separable state evolution (Lipschitz proximal map, controlled subgradient growth) hold for the sorted penalty or whether the SLOPE-specific argument fully closes any gaps. The abstract indicates they address this, but the details matter for how tight the error bounds end up. This is aimed at researchers working on high-dimensional penalized regression or AMP methods. A reader who needs either the statistical characterization or an algorithmic way to approximate SLOPE will find concrete value. It deserves a serious referee because the result is new, the method is explicit, and the simulations give an initial check on the claims.

Referee Report

1 major / 0 minor

Summary. The paper claims to develop an asymptotically exact characterization of the SLOPE solution (sorted ℓ1 penalized regression) under Gaussian random designs by solving the problem via approximate message passing (AMP). It characterizes the asymptotic dynamics of the AMP iterates using a recently developed state evolution framework for non-separable penalties, proves that the AMP iterates converge asymptotically to the SLOPE solution via a novel SLOPE-specific argument, and reports that numerical simulations show surprisingly fast convergence. The approach is positioned as both analytically sharp and algorithmically constructive.

Significance. If the central claims hold, the work supplies both a sharp asymptotic analysis of SLOPE and a practical algorithmic route to approximating its solution, which is useful for high-dimensional regression where the non-separable sorted-ℓ1 penalty has resisted standard techniques. The combination of state-evolution dynamics with a problem-specific convergence proof and fast empirical convergence constitutes a concrete strength.

major comments (1)

[Abstract] Abstract (paragraph on proof technique): the claim that the recently developed state evolution analysis for non-separable penalties applies directly to the sorted ℓ1 penalty (sum λ_i |β|_{(i)}) rests on an unverified transfer of regularity conditions (e.g., Lipschitz continuity of the proximal mapping or controlled subgradient growth). The novel SLOPE-specific convergence argument is invoked to overcome the difficulty, but without explicit confirmation that the base assumptions hold for this ordered, non-smooth penalty, the asymptotic characterization of the AMP dynamics and the convergence statement remain conditional.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for their careful reading and constructive feedback on the manuscript. We address the single major comment below regarding the abstract's discussion of the proof technique and state evolution assumptions.

read point-by-point responses

Referee: [Abstract] Abstract (paragraph on proof technique): the claim that the recently developed state evolution analysis for non-separable penalties applies directly to the sorted ℓ1 penalty (sum λ_i |β|_{(i)}) rests on an unverified transfer of regularity conditions (e.g., Lipschitz continuity of the proximal mapping or controlled subgradient growth). The novel SLOPE-specific convergence argument is invoked to overcome the difficulty, but without explicit confirmation that the base assumptions hold for this ordered, non-smooth penalty, the asymptotic characterization of the AMP dynamics and the convergence statement remain conditional.

Authors: We agree that the abstract would benefit from an explicit statement confirming that the regularity conditions of the referenced state-evolution framework hold for the sorted-ℓ1 proximal operator. The proximal mapping of the sorted ℓ1 penalty is Lipschitz continuous (as it reduces to a finite number of soft-thresholding operations after a permutation that is fixed for any given vector) and the subgradient is bounded by max λ_i, satisfying the controlled-growth requirement. These properties follow directly from the definition of the sorted-ℓ1 penalty and are consistent with the general conditions stated in the cited non-separable state-evolution paper. The novel SLOPE-specific convergence argument then closes the remaining gap arising from non-separability. In the revised version we will add one sentence to the abstract making this verification explicit and will include a short appendix paragraph (or reference to the appropriate section) that records the verification for the SLOPE proximal operator. This change improves clarity without altering any theorems or proofs. revision: yes

Circularity Check

0 steps flagged

Minor self-citation to state evolution framework; central derivation independent

full rationale

The paper develops an asymptotically exact characterization of the SLOPE solution by applying AMP iterates and proving their convergence via a novel SLOPE-specific technique. It relies on a recently developed state evolution analysis for non-separable penalties as an external tool to characterize the dynamics, with the novel technique overcoming the sorted l1 penalty difficulty. No quoted step shows a prediction reducing to a fitted input by construction, self-definition of quantities, or a load-bearing self-citation chain that collapses the result to unverified inputs. The derivation remains self-contained against the external framework and simulations, warranting only a minor score for possible author overlap in the cited state evolution work.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The analysis rests on the Gaussian random design assumption and the applicability of state evolution to non-separable penalties; no free parameters or invented entities are mentioned in the abstract.

axioms (2)

domain assumption Design matrix entries are i.i.d. Gaussian
Invoked for the asymptotic analysis under random designs in high dimensions.
domain assumption State evolution analysis extends to non-separable penalties
Required to characterize the dynamics of AMP iterates for the sorted l1 penalty.

pith-pipeline@v0.9.0 · 5742 in / 1190 out tokens · 24226 ms · 2026-05-24T20:05:47.879642+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

41 extracted references · 41 canonical work pages · 1 internal anchor

[1]

Bai and Y

Z. Bai and Y. Yin. Limit of the smallest eigenvalue of a large dimensional sample covariance matrix. In Advances In Statistics, pages 108–127. World Scientiﬁc, 2008

work page 2008
[2]

R. F. Barber and E. J. Candès. Controlling the false discovery rate via knockoﬀs.The Annals of Statistics, 43(5):2055–2085, 2015

work page 2055
[3]

Bayati, M

M. Bayati, M. A. Erdogdu, and A. Montanari. Estimating lasso risk and noise level. InAdvances in Neural Information Processing Systems, pages 944–952, 2013

work page 2013
[4]

Bayati and A

M. Bayati and A. Montanari. The dynamics of message passing on dense graphs, with applications to compressed sensing. IEEE Trans. on Inf. Theory, 57(2):764–785, 2011

work page 2011
[5]

Bayati and A

M. Bayati and A. Montanari. The lasso risk for gaussian matrices.IEEE Transactions on Information Theory, 58(4):1997–2017, 2011

work page 1997
[6]

Beck and M

A. Beck and M. Teboulle. A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM journal on imaging sciences, 2(1):183–202, 2009

work page 2009
[7]

P. C. Bellec, G. Lecué, and A. B. Tsybakov. SLOPE meets lasso: improved oracle bounds and optimality. The Annals of Statistics, 46(6B):3603–3642, 2018

work page 2018
[8]

State Evolution for Approximate Message Passing with Non-Separable Functions

R. Berthier, A. Montanari, and P.-M. Nguyen. State evolution for approximate message passing with non-separable functions. arXiv preprint arXiv:1708.03950, 2017

work page internal anchor Pith review Pith/arXiv arXiv 2017
[9]

Bogdan, E

M. Bogdan, E. Van Den Berg, C. Sabatti, W. Su, and E. J. Candès. SLOPE—adaptive variable selection via convex optimization.The Annals of Applied Statistics, 9(3):1103, 2015

work page 2015
[10]

H. D. Bondell and B. J. Reich. Simultaneous regression shrinkage, variable selection, and supervised clustering of predictors with oscar.Biometrics, 64(1):115–123, 2008

work page 2008
[11]

Brzyski, A

D. Brzyski, A. Gossmann, W. Su, and M. Bogdan. Group SLOPE—adaptive selection of groups of predictors. Journal of the American Statistical Association, pages 1–15, 2018

work page 2018
[12]

Celentano and A

M. Celentano and A. Montanari. Fundamental barriers to high-dimensional regression with convex penalties. arXiv preprint arXiv:1903.10603, 2019

work page arXiv 1903
[13]

Chambolle, R

A. Chambolle, R. A. De Vore, N.-Y. Lee, and B. J. Lucier. Nonlinear wavelet image processing: variational problems, compression, and noise removal through wavelet shrinkage.IEEE Transactions on Image Processing, 7(3):319–335, 1998

work page 1998
[14]

Daubechies, M

I. Daubechies, M. Defrise, and C. De Mol. An iterative thresholding algorithm for linear inverse problems with a sparsity constraint.Communications on Pure and Applied Mathematics: A Journal Issued by the Courant Institute of Mathematical Sciences, 57(11):1413–1457, 2004

work page 2004
[15]

Donoho and A

D. Donoho and A. Montanari. High dimensional robust m-estimation: Asymptotic variance via approximate message passing.Probability Theory and Related Fields, 166(3-4):935–969, 2016. 28

work page 2016
[16]

D. L. Donoho, A. Maleki, and A. Montanari. Message-passing algorithms for compressed sensing. Proceedings of the National Academy of Sciences, 106(45):18914–18919, 2009

work page 2009
[17]

D. L. Donoho, A. Maleki, and A. Montanari. The noise-sensitivity phase transition in compressed sensing. IEEE Transactions on Information Theory, 57(10):6920–6941, 2011

work page 2011
[18]

J. L. Doob.Stochastic processes, volume 101. New York Wiley, 1953

work page 1953
[19]

Figueiredo and R

M. Figueiredo and R. Nowak. Ordered weighted l1 regularized regression with strongly correlated covariates: Theoretical aspects. InArtiﬁcial Intelligence and Statistics, pages 930–938, 2016

work page 2016
[20]

Hu and Y

H. Hu and Y. M. Lu. Asymptotics and optimal designs of SLOPE for sparse linear regression.arXiv preprint arXiv:1903.11582, 2019

work page arXiv 1903
[21]

Javanmard and A

A. Javanmard and A. Montanari. State evolution for general approximate message passing algorithms, with applications to spatial coupling.Information and Inference: A Journal of the IMA, 2(2):115–144, 2013

work page 2013
[22]

B. S. Kashin. Diameters of some ﬁnite-dimensional sets and classes of smooth functions.Izvestiya Rossiiskoi Akademii Nauk. Seriya Matematicheskaya, 41(2):334–351, 1977

work page 1977
[23]

Krzakala, M

F. Krzakala, M. Mézard, F. Sausset, Y. Sun, and L. Zdeborová. Probabilistic reconstruction in compressed sensing: algorithms, phase diagrams, and threshold achieving matrices.J. Stat. Mech. Theory Exp., (8), 2012

work page 2012
[24]

M. Ledoux. The concentration of measure phenomenon. Number 89. American Mathematical Soc., 2001

work page 2001
[25]

A. E. Litvak, A. Pajor, M. Rudelson, and N. Tomczak-Jaegermann. Smallest singular value of random matrices and geometry of random polytopes.Advances in Mathematics, 195(2):491–523, 2005

work page 2005
[26]

F. J. MacWilliams and N. J. A. Sloane.The theory of error-correcting codes, volume 16. Elsevier, 1977

work page 1977
[27]

Montanari

A. Montanari. Graphical models concepts in compressed sensing. In Y. C. Eldar and G. Kutyniok, editors, Compressed Sensing, pages 394–438. Cambridge University Press, 2012

work page 2012
[28]

Mousavi, A

A. Mousavi, A. Maleki, R. G. Baraniuk, et al. Consistent parameter estimation for lasso and approximate message passing. The Annals of Statistics, 46(1):119–148, 2018

work page 2018
[29]

Parikh, S

N. Parikh, S. Boyd, et al. Proximal algorithms.Foundations and TrendsR⃝ in Optimization, 1(3):127–239, 2014

work page 2014
[30]

L. D. Pitt. Positively correlated normal variables are associated.The Annals of Probability, pages 496–499, 1982

work page 1982
[31]

S. Rangan. Generalized approximate message passing for estimation with random linear mixing. InProc. IEEE Int. Symp. Inf. Theory, pages 2168–2172, 2011

work page 2011
[32]

R. T. Rockafellar and R. J.-B. Wets.Variational analysis, volume 317. Springer Science & Business Media, 2009

work page 2009
[33]

H. L. Royden.Real analysis. Krishna Prakashan Media, 1968

work page 1968
[34]

Rudin et al.Principles of mathematical analysis, volume 3

W. Rudin et al.Principles of mathematical analysis, volume 3. McGraw-hill New York, 1964

work page 1964
[35]

Rush and R

C. Rush and R. Venkataramanan. Finite sample analysis of approximate message passing algorithms. IEEE Trans. on Inf. Theory, 64(11):7264–7286, 2018

work page 2018
[36]

W. Su, M. Bogdan, and E. Candès. False discoveries occur early on the lasso path.The Annals of Statistics, 45(5):2133–2150, 2017

work page 2017
[37]

Su and E

W. Su and E. Candès. SLOPE is adaptive to unknown sparsity and asymptotically minimax.The Annals of Statistics, 44(3):1038–1068, 2016. 29

work page 2016
[38]

Tibshirani

R. Tibshirani. Regression shrinkage and selection via the lasso.Journal of the Royal Statistical Society: Series B (Statistical Methodology), 58(1):267–288, 1996

work page 1996
[39]

Zeng and M

X. Zeng and M. A. Figueiredo. Decreasing weighted sortedℓ1 regularization. IEEE Signal Processing Letters, 21(10):1240–1244, 2014. 30 A State Evolution Analysis We ﬁrst prove Theorem 1 and then provide a proof of Proposition 2.6. A.1 Proving Theorem 1 Proof of Theorem 1.To begin with, we prove thatF(τ 2,ατ) deﬁned in(2.8) is concave with respect to τ 2. T...

work page 2014
[40]

We claim thatℓ∗ < 0

Then when τ 2 ∗ (α)→ +∞ asα↓Amin(δ), ℓ∗ := limα→amin ℓ(α) = limα→amin ( 1− 1 n E∥proxJατ∗(τ∗Z)∥∗ 0 { = 1− 1 n E∥proxJamin (Z)∥∗ 0 . We claim thatℓ∗ < 0. Using the deﬁnition of the vectorD and the setAmin(δ) in (2.7), ℓ∗ = 1− 1 n E∥proxJamin (Z)∥∗ 0 = 1− 1 δ E ⣨ 1 D(proxJamin (Z)) ⟩ < 1− 1 δp ∑ i E { 1 [D(proxJamin (Z))]i ( 1− ∑ j∈Ii [amin]j·| [proxJamin (...

work page
[41]

and let a2 = 1 2 α2( c 2 , t) where α1 and α2 are deﬁned by Lemma F.1. Then, P { min |s′|≤a1p min v(s′) ∥Xv∥ < a2 } ≤ e 1 2 pα1( c 2 ) E { max |s′|≤a1p P { min ∥v∥=1, supp∗(v)⊆s∪s′ ∥Xv∥ < 1 2 α2( c 2 , t) ⏐⏐⏐ St }} ≤ e 1 2 pα1( c 2 ) E { max |s′′|≤p(δ− c 2 ) P { min ∥v∥=1, supp∗(v)⊆s′′ ∥Xv∥ < 1 2 α2( c 2 , t) ⏐⏐⏐ St }} . Finally, using (cf. [5, Lemma 5.1]...

work page

[1] [1]

Bai and Y

Z. Bai and Y. Yin. Limit of the smallest eigenvalue of a large dimensional sample covariance matrix. In Advances In Statistics, pages 108–127. World Scientiﬁc, 2008

work page 2008

[2] [2]

R. F. Barber and E. J. Candès. Controlling the false discovery rate via knockoﬀs.The Annals of Statistics, 43(5):2055–2085, 2015

work page 2055

[3] [3]

Bayati, M

M. Bayati, M. A. Erdogdu, and A. Montanari. Estimating lasso risk and noise level. InAdvances in Neural Information Processing Systems, pages 944–952, 2013

work page 2013

[4] [4]

Bayati and A

M. Bayati and A. Montanari. The dynamics of message passing on dense graphs, with applications to compressed sensing. IEEE Trans. on Inf. Theory, 57(2):764–785, 2011

work page 2011

[5] [5]

Bayati and A

M. Bayati and A. Montanari. The lasso risk for gaussian matrices.IEEE Transactions on Information Theory, 58(4):1997–2017, 2011

work page 1997

[6] [6]

Beck and M

A. Beck and M. Teboulle. A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM journal on imaging sciences, 2(1):183–202, 2009

work page 2009

[7] [7]

P. C. Bellec, G. Lecué, and A. B. Tsybakov. SLOPE meets lasso: improved oracle bounds and optimality. The Annals of Statistics, 46(6B):3603–3642, 2018

work page 2018

[8] [8]

State Evolution for Approximate Message Passing with Non-Separable Functions

R. Berthier, A. Montanari, and P.-M. Nguyen. State evolution for approximate message passing with non-separable functions. arXiv preprint arXiv:1708.03950, 2017

work page internal anchor Pith review Pith/arXiv arXiv 2017

[9] [9]

Bogdan, E

M. Bogdan, E. Van Den Berg, C. Sabatti, W. Su, and E. J. Candès. SLOPE—adaptive variable selection via convex optimization.The Annals of Applied Statistics, 9(3):1103, 2015

work page 2015

[10] [10]

H. D. Bondell and B. J. Reich. Simultaneous regression shrinkage, variable selection, and supervised clustering of predictors with oscar.Biometrics, 64(1):115–123, 2008

work page 2008

[11] [11]

Brzyski, A

D. Brzyski, A. Gossmann, W. Su, and M. Bogdan. Group SLOPE—adaptive selection of groups of predictors. Journal of the American Statistical Association, pages 1–15, 2018

work page 2018

[12] [12]

Celentano and A

M. Celentano and A. Montanari. Fundamental barriers to high-dimensional regression with convex penalties. arXiv preprint arXiv:1903.10603, 2019

work page arXiv 1903

[13] [13]

Chambolle, R

A. Chambolle, R. A. De Vore, N.-Y. Lee, and B. J. Lucier. Nonlinear wavelet image processing: variational problems, compression, and noise removal through wavelet shrinkage.IEEE Transactions on Image Processing, 7(3):319–335, 1998

work page 1998

[14] [14]

Daubechies, M

I. Daubechies, M. Defrise, and C. De Mol. An iterative thresholding algorithm for linear inverse problems with a sparsity constraint.Communications on Pure and Applied Mathematics: A Journal Issued by the Courant Institute of Mathematical Sciences, 57(11):1413–1457, 2004

work page 2004

[15] [15]

Donoho and A

D. Donoho and A. Montanari. High dimensional robust m-estimation: Asymptotic variance via approximate message passing.Probability Theory and Related Fields, 166(3-4):935–969, 2016. 28

work page 2016

[16] [16]

D. L. Donoho, A. Maleki, and A. Montanari. Message-passing algorithms for compressed sensing. Proceedings of the National Academy of Sciences, 106(45):18914–18919, 2009

work page 2009

[17] [17]

D. L. Donoho, A. Maleki, and A. Montanari. The noise-sensitivity phase transition in compressed sensing. IEEE Transactions on Information Theory, 57(10):6920–6941, 2011

work page 2011

[18] [18]

J. L. Doob.Stochastic processes, volume 101. New York Wiley, 1953

work page 1953

[19] [19]

Figueiredo and R

M. Figueiredo and R. Nowak. Ordered weighted l1 regularized regression with strongly correlated covariates: Theoretical aspects. InArtiﬁcial Intelligence and Statistics, pages 930–938, 2016

work page 2016

[20] [20]

Hu and Y

H. Hu and Y. M. Lu. Asymptotics and optimal designs of SLOPE for sparse linear regression.arXiv preprint arXiv:1903.11582, 2019

work page arXiv 1903

[21] [21]

Javanmard and A

A. Javanmard and A. Montanari. State evolution for general approximate message passing algorithms, with applications to spatial coupling.Information and Inference: A Journal of the IMA, 2(2):115–144, 2013

work page 2013

[22] [22]

B. S. Kashin. Diameters of some ﬁnite-dimensional sets and classes of smooth functions.Izvestiya Rossiiskoi Akademii Nauk. Seriya Matematicheskaya, 41(2):334–351, 1977

work page 1977

[23] [23]

Krzakala, M

F. Krzakala, M. Mézard, F. Sausset, Y. Sun, and L. Zdeborová. Probabilistic reconstruction in compressed sensing: algorithms, phase diagrams, and threshold achieving matrices.J. Stat. Mech. Theory Exp., (8), 2012

work page 2012

[24] [24]

M. Ledoux. The concentration of measure phenomenon. Number 89. American Mathematical Soc., 2001

work page 2001

[25] [25]

A. E. Litvak, A. Pajor, M. Rudelson, and N. Tomczak-Jaegermann. Smallest singular value of random matrices and geometry of random polytopes.Advances in Mathematics, 195(2):491–523, 2005

work page 2005

[26] [26]

F. J. MacWilliams and N. J. A. Sloane.The theory of error-correcting codes, volume 16. Elsevier, 1977

work page 1977

[27] [27]

Montanari

A. Montanari. Graphical models concepts in compressed sensing. In Y. C. Eldar and G. Kutyniok, editors, Compressed Sensing, pages 394–438. Cambridge University Press, 2012

work page 2012

[28] [28]

Mousavi, A

A. Mousavi, A. Maleki, R. G. Baraniuk, et al. Consistent parameter estimation for lasso and approximate message passing. The Annals of Statistics, 46(1):119–148, 2018

work page 2018

[29] [29]

Parikh, S

N. Parikh, S. Boyd, et al. Proximal algorithms.Foundations and TrendsR⃝ in Optimization, 1(3):127–239, 2014

work page 2014

[30] [30]

L. D. Pitt. Positively correlated normal variables are associated.The Annals of Probability, pages 496–499, 1982

work page 1982

[31] [31]

S. Rangan. Generalized approximate message passing for estimation with random linear mixing. InProc. IEEE Int. Symp. Inf. Theory, pages 2168–2172, 2011

work page 2011

[32] [32]

R. T. Rockafellar and R. J.-B. Wets.Variational analysis, volume 317. Springer Science & Business Media, 2009

work page 2009

[33] [33]

H. L. Royden.Real analysis. Krishna Prakashan Media, 1968

work page 1968

[34] [34]

Rudin et al.Principles of mathematical analysis, volume 3

W. Rudin et al.Principles of mathematical analysis, volume 3. McGraw-hill New York, 1964

work page 1964

[35] [35]

Rush and R

C. Rush and R. Venkataramanan. Finite sample analysis of approximate message passing algorithms. IEEE Trans. on Inf. Theory, 64(11):7264–7286, 2018

work page 2018

[36] [36]

W. Su, M. Bogdan, and E. Candès. False discoveries occur early on the lasso path.The Annals of Statistics, 45(5):2133–2150, 2017

work page 2017

[37] [37]

Su and E

W. Su and E. Candès. SLOPE is adaptive to unknown sparsity and asymptotically minimax.The Annals of Statistics, 44(3):1038–1068, 2016. 29

work page 2016

[38] [38]

Tibshirani

R. Tibshirani. Regression shrinkage and selection via the lasso.Journal of the Royal Statistical Society: Series B (Statistical Methodology), 58(1):267–288, 1996

work page 1996

[39] [39]

Zeng and M

X. Zeng and M. A. Figueiredo. Decreasing weighted sortedℓ1 regularization. IEEE Signal Processing Letters, 21(10):1240–1244, 2014. 30 A State Evolution Analysis We ﬁrst prove Theorem 1 and then provide a proof of Proposition 2.6. A.1 Proving Theorem 1 Proof of Theorem 1.To begin with, we prove thatF(τ 2,ατ) deﬁned in(2.8) is concave with respect to τ 2. T...

work page 2014

[40] [40]

We claim thatℓ∗ < 0

Then when τ 2 ∗ (α)→ +∞ asα↓Amin(δ), ℓ∗ := limα→amin ℓ(α) = limα→amin ( 1− 1 n E∥proxJατ∗(τ∗Z)∥∗ 0 { = 1− 1 n E∥proxJamin (Z)∥∗ 0 . We claim thatℓ∗ < 0. Using the deﬁnition of the vectorD and the setAmin(δ) in (2.7), ℓ∗ = 1− 1 n E∥proxJamin (Z)∥∗ 0 = 1− 1 δ E ⣨ 1 D(proxJamin (Z)) ⟩ < 1− 1 δp ∑ i E { 1 [D(proxJamin (Z))]i ( 1− ∑ j∈Ii [amin]j·| [proxJamin (...

work page

[41] [41]

and let a2 = 1 2 α2( c 2 , t) where α1 and α2 are deﬁned by Lemma F.1. Then, P { min |s′|≤a1p min v(s′) ∥Xv∥ < a2 } ≤ e 1 2 pα1( c 2 ) E { max |s′|≤a1p P { min ∥v∥=1, supp∗(v)⊆s∪s′ ∥Xv∥ < 1 2 α2( c 2 , t) ⏐⏐⏐ St }} ≤ e 1 2 pα1( c 2 ) E { max |s′′|≤p(δ− c 2 ) P { min ∥v∥=1, supp∗(v)⊆s′′ ∥Xv∥ < 1 2 α2( c 2 , t) ⏐⏐⏐ St }} . Finally, using (cf. [5, Lemma 5.1]...

work page