pith. sign in

arxiv: 1907.03823 · v1 · pith:ULPM77OXnew · submitted 2019-07-08 · 🧮 math.OC

New results on the local linear convergence of ADMM: a joint approach

Pith reviewed 2026-05-25 01:01 UTC · model grok-4.3

classification 🧮 math.OC
keywords ADMMlinear convergencespectral radiusmatrix producteigenvalue locationdistributed optimizationconvex problems
0
0 comments X

The pith

ADMM's local linear convergence rate is more precisely bounded by the spectral radius of a matrix product than by the product of separate matrix norms.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops new theoretical tools to analyze the convergence of the alternating direction method of multipliers by focusing on the eigenstructure of two matrices that appear in its iteration. It introduces a technique for locating the eigenvalues of their symmetric product, which directly supplies the contraction factor that governs linear convergence. Because the approach solves the joint spectral-radius problem rather than bounding the product of two norms, the resulting speed estimate is tighter than prior results. A sympathetic reader would care because ADMM is widely used in distributed convex optimization, and sharper convergence guarantees can guide parameter selection without extensive empirical tuning.

Core claim

By identifying the spectral radius of the product of two matrices that arise in the ADMM iteration, rather than bounding the rate via the product of their individual norms, a sharper estimate of the local linear convergence speed is obtained. The new eigenvalue-location technique for symmetric matrix products makes this joint calculation feasible and yields the improved precision.

What carries the argument

The eigenvalue-location technique for the product of two symmetric matrices, which computes the spectral radius that controls the contraction of the ADMM iteration.

If this is right

  • Parameter selection for ADMM instances can rely on an exact rather than loose contraction bound.
  • Distributed convex problems solved by ADMM inherit the improved rate guarantee when the symmetry condition holds.
  • The same joint spectral-radius calculation can be reused across different problem instances that share the same matrix structure.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The method may reduce the need for trial-and-error tuning by supplying a closed-form expression for the optimal step-size in symmetric cases.
  • If the symmetry assumption is relaxed, analogous joint bounds might be derived using numerical range or pseudospectral techniques.

Load-bearing premise

The convergence speed of ADMM is controlled by the spectral radius of the product of two specific matrices that arise in the algorithm, and those matrices are symmetric.

What would settle it

Apply ADMM to a convex problem whose two constituent matrices are known and symmetric; if the observed contraction factor in the iterates exceeds the spectral radius computed by the new location technique, the claimed link between that radius and the convergence rate is refuted.

Figures

Figures reproduced from arXiv: 1907.03823 by Tomaso Erseghe.

Figure 1
Figure 1. Figure 1: Function h(x), for x > −1. providing, and are also a straightforward consequence of the fact that we are avoiding [10, Assumption 2.ii], i.e., the request of Ai being a full-row-rank matrix. Incidentally, the presence of eigenvalues −1 is common in the very many optimisation problems where some of the variables are duplicated more than once. B. Smoothness + convexity scenario If we focus on a scenario wher… view at source ↗
Figure 2
Figure 2. Figure 2: Exemplification of operator Di for a piecewise linear function fi with increasing slopes mj < mj+1 and breaking points xk < xk+1. transform. To make our point, we investigate a context where variables are simply duplicated, and functions fi are expressed as a linear combination of one–dimensional convex contribu￾tions. This is equivalent to considering that E =  and Ai = a are scalar values, that b = 0 (f… view at source ↗
Figure 3
Figure 3. Figure 3: On the comparison between [11, Theorem 5.6] (dashed line) and (47) [PITH_FULL_IMAGE:figures/full_fig_p007_3.png] view at source ↗
Figure 4
Figure 4. Figure 4: Locus N of the eigenvalues of N(α) in the complex plane (bold lines), under the assumptions of Theorem 1. both), or negative with value −n or −n˘ (either of the two, but not both). ✷ Proof: See Appendix A. We observe that, depending on the value of ci , the eigenval￾ues (52) can be real and positive, real and negative, or complex valued. The transition is controlled by the threshold values c = | p p1n2 − p… view at source ↗
Figure 5
Figure 5. Figure 5: Exemplification of the way the mixing constant [PITH_FULL_IMAGE:figures/full_fig_p008_5.png] view at source ↗
Figure 6
Figure 6. Figure 6: Locus N of the eigenvalues of N(α) in the complex plane (bold lines show the boundaries), under the assumptions of Theorem 3. formalisation is suggesting the inappropriate choice q = 0, that is, we are not able to build a convergent algorithm. B. A more general statement Although the four–level assumption of (49) is linked to a very specific setting observed at convergence, its outcomes are more general th… view at source ↗
Figure 9
Figure 9. Figure 9: Convergence measure Γ = kz + − zk versus the iteration number n, compared with the two speed estimates given by ρmax (dashed lines) and λ ∗ max (dash-dotted lines). splitting in the time domain (as opposed to the classical frequency domain interpretation). The interesting outcome is that convergence speed is linked to the eigenvalues of matrices R(α) whose general structure is evidenced in [PITH_FULL_IMAG… view at source ↗
Figure 8
Figure 8. Figure 8: Locus N = R of the eigenvalues (in gray) containing the eigenvalues at the limit point z ∗ (dots), for  = 1 and q = 1. In the specific example we have λmax = 60.75 and λmin = 0.3465, providing p1 = 0.4853 and n1 = 0.9676. Note in [PITH_FULL_IMAGE:figures/full_fig_p010_8.png] view at source ↗
read the original abstract

Thanks to its versatility, its simplicity, and its fast convergence, ADMM is among the most widely used approaches for solving a convex problem in distributed form. However, making it running efficiently is an art that requires a fine tuning of system parameters according to the specific application scenario, and which ultimately calls for a thorough understanding of the hidden mechanisms that control the convergence behaviour. In this framework we aim at providing new theoretical insights on the convergence process and specifically on some constituent matrices of ADMM whose eigenstructure provides a close link with the algorithm's convergence speed. One of the key technique that we develop allows to effectively locate the eigenvalues of a (symmetric) matrix product, thus being able to estimate the contraction properties of ADMM. In the comparison with the results available from the literature, we are able to strengthen the precision of our speed estimate thanks to the fact that we are solving a joint problem (i.e., we are identifying the spectral radius of the product of two matrices) in place of two separate problems (the product of two matrix norms).

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 2 minor

Summary. The manuscript claims to strengthen the analysis of local linear convergence for ADMM by examining the eigenstructure of its constituent matrices and their connection to convergence speed. It develops an eigenvalue-location technique applicable to symmetric matrix products, enabling tighter estimates of the contraction factor. The central improvement is obtained by computing the spectral radius of the product of two matrices (a joint problem) rather than the product of their individual norms (two separate problems), yielding a more precise speed bound than prior literature.

Significance. If the derivations and applicability conditions hold, the work provides a routine but useful tightening of linear-rate bounds for ADMM, a widely deployed algorithm. The joint spectral-radius formulation is mathematically standard and avoids the looseness of norm-product bounds; the eigenvalue-location result for symmetric products may find use in related operator-splitting analyses. No machine-checked proofs or reproducible code are mentioned, but the approach is internally consistent and free of evident circularity or ad-hoc parameters.

major comments (1)
  1. [Abstract and §2] Abstract and §2 (matrix definitions): the claim that the relevant matrices are symmetric (allowing the new eigenvalue-location technique) is load-bearing for the central result, yet the conditions on the underlying convex problem that guarantee symmetry of the iteration-matrix product are not stated explicitly; without this, the scope of the improved bound remains unclear.
minor comments (2)
  1. Notation for the two constituent matrices (call them A and B in the abstract) should be introduced with explicit definitions and dimensions before the spectral-radius claim is stated.
  2. [Abstract] The comparison paragraph in the abstract would be strengthened by citing the specific prior bounds (e.g., the norm-product results) that are being improved.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for the careful review and the constructive comment on clarifying the symmetry conditions. We address the point below and will revise the manuscript to make the scope of the result explicit.

read point-by-point responses
  1. Referee: [Abstract and §2] Abstract and §2 (matrix definitions): the claim that the relevant matrices are symmetric (allowing the new eigenvalue-location technique) is load-bearing for the central result, yet the conditions on the underlying convex problem that guarantee symmetry of the iteration-matrix product are not stated explicitly; without this, the scope of the improved bound remains unclear.

    Authors: We agree that the conditions guaranteeing symmetry of the iteration-matrix product are load-bearing and should be stated explicitly. In the revised manuscript we will add a dedicated paragraph in §2 that lists the structural assumptions on the convex problem (e.g., the form of the quadratic objective terms and the constraint matrices) under which the product of the two ADMM iteration matrices is symmetric. This will delineate the precise scope of the eigenvalue-location technique and the tightened contraction-factor bound without changing any of the existing derivations. revision: yes

Circularity Check

0 steps flagged

No significant circularity in derivation chain

full rationale

The paper develops a technique for locating eigenvalues of symmetric matrix products to bound ADMM linear convergence rates, with the central improvement being the use of the spectral radius of a matrix product rather than a looser product-of-norms bound. No equations or steps reduce by construction to fitted parameters, self-definitions, or load-bearing self-citations; the eigenstructure-to-rate link and joint-problem formulation are presented as standard linear-algebra results applied to the ADMM iteration matrices. The derivation is therefore self-contained against external literature benchmarks and receives the default non-circularity finding.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract alone supplies no concrete free parameters, axioms, or invented entities; the central claim rests on standard convex-analysis and linear-algebra background whose details are not visible.

pith-pipeline@v0.9.0 · 5708 in / 914 out tokens · 25177 ms · 2026-05-25T01:01:36.768558+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

33 extracted references · 33 canonical work pages · 2 internal anchors

  1. [1]

    S. Boyd, N. Parikh, E. Chu, B. Peleato, and J. Eckstein, Distributed optimization and statistical learning via the alternating direction method of multipliers, ser. Foundations and Trends in Machine Learning. Now Pubishers Inc, 2010, vol. 3, no. 1, pp. 1-122

  2. [2]

    On the O(1/n) convergence rate of the Douglas– Rachford alternating direction method,

    B. He and X. Yuan, “On the O(1/n) convergence rate of the Douglas– Rachford alternating direction method,” SIAM Journal on Numerical Analysis, vol. 50, no. 2, pp. 700–709, 2012

  3. [3]

    Fast alternat- ing direction optimization methods,

    T. Goldstein, B. O’Donoghue, S. Setzer, and R. Baraniuk, “Fast alternat- ing direction optimization methods,” SIAM Journal on Imaging Sciences, vol. 7, no. 3, pp. 1588–1623, 2014

  4. [4]

    On non-ergodic convergence rate of Douglas– Rachford alternating direction method of multipliers,

    B. He and X. Yuan, “On non-ergodic convergence rate of Douglas– Rachford alternating direction method of multipliers,”Numerische Math- ematik, vol. 130, no. 3, pp. 567–577, 2015

  5. [5]

    Convergence rate analysis of several splitting schemes,

    D. Davis and W. Yin, “Convergence rate analysis of several splitting schemes,” in Splitting Methods in Communication, Imaging, Science, and Engineering. Springer, 2016, pp. 115–163

  6. [6]

    Optimal parameter selection for the alternating direction method of multipliers (ADMM): quadratic problems,

    E. Ghadimi, A. Teixeira, I. Shames, and M. Johansson, “Optimal parameter selection for the alternating direction method of multipliers (ADMM): quadratic problems,” IEEE Transactions on Automatic Con- trol, vol. 60, no. 3, pp. 644–658, 2015

  7. [7]

    A General Analysis of the Convergence of ADMM

    R. Nishihara, L. Lessard, B. Recht, A. Packard, and M. I. Jordan, “A general analysis of the convergence of ADMM,” arXiv preprint arXiv:1502.02009, 2015

  8. [8]

    On the global and linear convergence of the generalized alternating direction method of multipliers,

    W. Deng and W. Yin, “On the global and linear convergence of the generalized alternating direction method of multipliers,” Journal of Scientific Computing, vol. 66, no. 3, pp. 889–916, 2016

  9. [9]

    Faster convergence rates of relaxed Peaceman- Rachford and ADMM under regularity assumptions,

    D. Davis and W. Yin, “Faster convergence rates of relaxed Peaceman- Rachford and ADMM under regularity assumptions,” Mathematics of Operations Research, vol. 42, no. 3, pp. 783–805, 2017

  10. [10]

    Linear convergence and metric selection for Douglas-Rachford splitting and ADMM,

    P. Giselsson and S. Boyd, “Linear convergence and metric selection for Douglas-Rachford splitting and ADMM,” IEEE Transactions on Automatic Control, vol. 62, no. 2, pp. 532–544, 2017

  11. [11]

    Tight global linear convergence rate bounds for Douglas– Rachford splitting,

    P. Giselsson, “Tight global linear convergence rate bounds for Douglas– Rachford splitting,” Journal of Fixed Point Theory and Applications , vol. 19, no. 4, pp. 2241–2270, 2017

  12. [12]

    On the linear convergence of the alternating direction method of multipliers,

    M. Hong and Z.-Q. Luo, “On the linear convergence of the alternating direction method of multipliers,” Mathematical Programming, vol. 162, no. 1-2, pp. 165–199, 2017

  13. [13]

    Local convergence properties of Douglas–Rachford and alternating direction method of multipliers,

    J. Liang, J. Fadili, and G. Peyr ´e, “Local convergence properties of Douglas–Rachford and alternating direction method of multipliers,” Journal of Optimization Theory and Applications , vol. 172, no. 3, pp. 874–913, 2017

  14. [14]

    On the Sublinear Convergence Rate of Multi-Block ADMM

    T. Lin, S. Ma, and S. Zhang, “On the convergence rate of multi-block ADMM,” arXiv preprint arXiv:1408.4265 , vol. 229, 2014

  15. [15]

    The direct extension of ADMM for multi-block convex minimization problems is not necessarily conver- gent,

    C. Chen, B. He, Y . Ye, and X. Yuan, “The direct extension of ADMM for multi-block convex minimization problems is not necessarily conver- gent,” Mathematical Programming, vol. 155, no. 1-2, pp. 57–79, 2016

  16. [16]

    Parallel multi-block ADMM with o(1/k) convergence,

    W. Deng, M.-J. Lai, Z. Peng, and W. Yin, “Parallel multi-block ADMM with o(1/k) convergence,” Journal of Scientific Computing , vol. 71, no. 2, pp. 712–736, 2017

  17. [17]

    On the linear convergence of the ADMM in decentralized consensus optimization

    W. Shi, Q. Ling, K. Yuan, G. Wu, and W. Yin, “On the linear convergence of the ADMM in decentralized consensus optimization.” IEEE Trans. Signal Processing , vol. 62, no. 7, pp. 1750–1761, 2014

  18. [18]

    Linear convergence rate of a class of distributed augmented Lagrangian algorithms,

    D. Jakoveti ´c, J. M. Moura, and J. Xavier, “Linear convergence rate of a class of distributed augmented Lagrangian algorithms,” IEEE Transactions on Automatic Control , vol. 60, no. 4, pp. 922–936, 2015

  19. [19]

    Explicit convergence rate of a distributed alternating direction method of multipliers,

    F. Iutzeler, P. Bianchi, P. Ciblat, and W. Hachem, “Explicit convergence rate of a distributed alternating direction method of multipliers,” IEEE Transactions on Automatic Control , vol. 61, no. 4, pp. 892–904, 2016

  20. [20]

    Convergence of descent methods for semi-algebraic and tame problems: proximal algorithms, forward–backward splitting, and regularized Gauss–Seidel methods,

    H. Attouch, J. Bolte, and B. F. Svaiter, “Convergence of descent methods for semi-algebraic and tame problems: proximal algorithms, forward–backward splitting, and regularized Gauss–Seidel methods,” Mathematical Programming, vol. 137, no. 1-2, pp. 91–129, 2013

  21. [21]

    Global convergence of ADMM in nonconvex nonsmooth optimization,

    Y . Wang, W. Yin, and J. Zeng, “Global convergence of ADMM in nonconvex nonsmooth optimization,” Journal of Scientific Computing , pp. 1–35, 2015. SUBMITTED TO IEEE TRANSACTIONS ON SIGNAL PROCESSING 14

  22. [22]

    Douglas–Rachford splitting for nonconvex optimization with application to nonconvex feasibility problems,

    G. Li and T. K. Pong, “Douglas–Rachford splitting for nonconvex optimization with application to nonconvex feasibility problems,” Math- ematical programming, vol. 159, no. 1-2, pp. 371–401, 2016

  23. [23]

    R. T. Rockafellar and R. J.-B. Wets, Variational analysis. Springer Science & Business Media, 2009, vol. 317

  24. [24]

    H. H. Bauschke, P. L. Combettes et al., Convex analysis and monotone operator theory in Hilbert spaces . Springer, 2011, vol. 408

  25. [25]

    On the simultaneous diagonalization of two semi- definite matrices,

    R. W. Newcomb, “On the simultaneous diagonalization of two semi- definite matrices,”Quarterly of Applied Mathematics , vol. 19, no. 2, pp. 144–146, 1961

  26. [26]

    R. A. Horn and C. R. Johnson, Matrix analysis. Cambridge university press, 2013, second Edition

  27. [27]

    Op- timal rates of linear convergence of relaxed alternating projections and generalized Douglas-Rachford methods for two subspaces,

    H. H. Bauschke, J. B. Cruz, T. T. Nghia, H. M. Pha, and X. Wang, “Op- timal rates of linear convergence of relaxed alternating projections and generalized Douglas-Rachford methods for two subspaces,” Numerical Algorithms, vol. 73, no. 1, pp. 33–76, 2016

  28. [28]

    Eventual linear convergence of the Douglas- Rachford iteration for basis pursuit,

    L. Demanet and X. Zhang, “Eventual linear convergence of the Douglas- Rachford iteration for basis pursuit,” Mathematics of Computation , vol. 85, no. 297, pp. 209–238, 2016

  29. [29]

    Numerical methods for computing angles between linear subspaces,

    k. Bj ¨orck and G. H. Golub, “Numerical methods for computing angles between linear subspaces,” Mathematics of computation , vol. 27, no. 123, pp. 579–594, 1973

  30. [30]

    G. H. Golub and C. F. Van Loan, Matrix computations . JHU press, 2012, vol. 3

  31. [31]

    Spectral properties of the Google matrix of the World Wide Web and other directed networks,

    B. Georgeot, O. Giraud, and D. L. Shepelyansky, “Spectral properties of the Google matrix of the World Wide Web and other directed networks,” Physical Review E , vol. 81, no. 5, p. 056109, 2010

  32. [32]

    On weakly positive matrices,

    E. P. Wigner, “On weakly positive matrices,” in The Collected Works of Eugene Paul Wigner. Springer, 1993, pp. 559–563

  33. [33]

    Weakly positive definite matrices. simula research laboratory,

    T. Nilssen, “Weakly positive definite matrices. simula research laboratory,” Research Report 07–2005. URL: http://www. simula. no/departments/scientific , Tech. Rep., 2005