pith. sign in

arxiv: 2510.05779 · v2 · submitted 2025-10-07 · 🧮 math.OC

The Golden Ratio Proximal ADMM with Norm Independent Step-Sizes for Separable Convex Optimization

Pith reviewed 2026-05-18 09:12 UTC · model grok-4.3

classification 🧮 math.OC
keywords golden ratio proximal ADMMstep-size strategiesnorm-independent stepsseparable convex optimizationglobal convergencesublinear ratesalternating direction method
0
0 comments X

The pith

Two new step-size rules let golden-ratio proximal ADMM solve separable convex problems without operator norm estimates.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper develops two step-size selection strategies for the golden ratio proximal ADMM that solve linearly constrained separable convex optimization problems. The strategies use only cheap local information from the current iterate and require no backtracking or explicit norm computations. The second strategy can adapt by increasing the step-size and recovering from bad initial values. Global convergence of the iterates is proved along with sublinear rates under standard assumptions, and pointwise rates are obtained as well. The first rule reduces exactly to the fixed step-size version if the starting step-size is chosen small enough.

Core claim

The core discovery is that inexpensive local information suffices to choose step-sizes for the golden ratio proximal ADMM, yielding globally convergent algorithms with sublinear convergence rates for separable convex optimization problems with linear constraints.

What carries the argument

The golden ratio proximal ADMM equipped with the two norm-independent step-size strategies based on local iterate data.

If this is right

  • The generated iterates converge globally to an optimal solution.
  • Sublinear convergence rates are established for both proposed algorithms.
  • Pointwise convergence rates for the iterates are derived.
  • The first step-size rule coincides with a fixed-step-size method when the initial step-size is below a certain threshold.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • These local step-size choices could lower the setup cost for large-scale problems where computing global operator norms is expensive.
  • The ability to increase step-sizes over time may lead to faster practical convergence in applications with varying curvature.
  • Similar adaptive rules might be developed for other splitting methods that currently depend on norm bounds.

Load-bearing premise

The problem data satisfy the convexity and constraint conditions assumed in the standard convergence theory for proximal ADMM.

What would settle it

Running the algorithms on a simple convex optimization problem with known solution and observing that the iterates do not approach the solution at the claimed rate would falsify the convergence results.

Figures

Figures reproduced from arXiv: 2510.05779 by Santanu Soe, V. Vetrivel.

Figure 1
Figure 1. Figure 1: Feasibility residual and relative objective gap fo [PITH_FULL_IMAGE:figures/full_fig_p016_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: ROF denoising reconstruction by different algorit [PITH_FULL_IMAGE:figures/full_fig_p017_2.png] view at source ↗
Figure 3
Figure 3. Figure 3: Convergence plot for different methods: (a) relati [PITH_FULL_IMAGE:figures/full_fig_p018_3.png] view at source ↗
Figure 4
Figure 4. Figure 4: TV-regularized image deblurring recovered by fou [PITH_FULL_IMAGE:figures/full_fig_p019_4.png] view at source ↗
Figure 5
Figure 5. Figure 5: Convergence plots for the image deblurring experi [PITH_FULL_IMAGE:figures/full_fig_p020_5.png] view at source ↗
Figure 6
Figure 6. Figure 6: Convergence plot summaries for the unbalanced opt [PITH_FULL_IMAGE:figures/full_fig_p021_6.png] view at source ↗
Figure 7
Figure 7. Figure 7: Transport plans as heatmaps (rows: sources, colum [PITH_FULL_IMAGE:figures/full_fig_p022_7.png] view at source ↗
read the original abstract

In this work, we propose two step-size strategies for the Golden ratio proximal ADMM (GrpADMM) to solve linearly constrained separable convex optimization problems. Both strategies eliminate explicit operator norm estimates by relying on inexpensive local information computed at the current iterate and requiring no backtracking. However, the key difference is that the second step-size strategy allows recovery from poor initial steps and can increase from iteration to iteration. Under standard assumptions, we establish global convergence of the generated iterates and derive sublinear convergence rates for both algorithms. We also obtain pointwise convergence rate results for the iterates of the algorithms. In addition, we show that the first proposed step-size rule for GrpADMM reduces to the fixed-step-size counterpart when the initial step-size is chosen below a certain threshold. Preliminary numerical experiments demonstrate the practical adaptability and effectiveness of the proposed approaches.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 3 minor

Summary. The manuscript proposes two step-size strategies for the Golden ratio proximal ADMM (GrpADMM) to solve linearly constrained separable convex optimization problems. Both strategies use inexpensive local information at the current iterate to avoid explicit operator norm estimates and do not require backtracking. The second strategy allows the step-size to increase. Under standard assumptions, global convergence and sublinear convergence rates are established for both, along with pointwise convergence rates. The first strategy reduces to fixed-step ADMM below a certain initial step-size threshold. Preliminary numerical experiments are provided to demonstrate effectiveness.

Significance. If the convergence analysis holds, this contribution is significant for practical implementation of ADMM-type methods in large-scale optimization, as it removes the need for computing operator norms which can be costly or difficult. The local information-based rules and the increasing step-size option enhance adaptability. Credit is due for deriving both sublinear and pointwise rates, and for showing the reduction to the fixed-step case. This could be useful for practitioners dealing with separable convex problems.

major comments (1)
  1. [Convergence analysis section] The global convergence and rate proofs rely on the proposed local step-size rules satisfying the standard descent/Lyapunov inequality from proximal ADMM analyses. Explicitly confirm in the relevant theorem (around the statement of the two algorithms) that the rules meet this condition for all iterates without requiring additional boundedness or hidden assumptions on the local information.
minor comments (3)
  1. [Algorithm descriptions] Clarify the exact form of the local information used in the step-size rules (e.g., which norm or inner product appears) to make the algorithms fully reproducible from the pseudocode alone.
  2. [Numerical experiments] In the numerical section, add a brief discussion of how the increasing step-size rule recovers from poor initial choices, with reference to specific iteration counts or objective values in the tables/figures.
  3. [Main results] The abstract states that one rule 'reduces to the fixed-step-size counterpart'; include a short remark or corollary in the main text showing the threshold condition explicitly.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for the careful reading, positive assessment, and constructive comment. We address the major comment below.

read point-by-point responses
  1. Referee: [Convergence analysis section] The global convergence and rate proofs rely on the proposed local step-size rules satisfying the standard descent/Lyapunov inequality from proximal ADMM analyses. Explicitly confirm in the relevant theorem (around the statement of the two algorithms) that the rules meet this condition for all iterates without requiring additional boundedness or hidden assumptions on the local information.

    Authors: We appreciate the suggestion to make this verification more explicit. In the revised manuscript we will add a dedicated remark (or short lemma) immediately after the statements of the two algorithms and prior to the main convergence theorems. The remark will confirm that both proposed local step-size rules satisfy the standard descent/Lyapunov inequality at every iterate. This follows directly from the definition of the rules, which select the step-size using only finite, well-defined local quantities (e.g., residual norms or inner products evaluated at the current iterate) and requires no additional boundedness assumptions on the operators, the local information, or the sequence of iterates beyond the standard convexity, closedness, and constraint-qualification assumptions already stated in the paper. The existing proofs already proceed by direct substitution of these local choices into the Lyapunov function; the added remark will simply isolate and highlight this verification for the reader. revision: yes

Circularity Check

0 steps flagged

Standard ADMM convergence analysis; no load-bearing reduction to self-fit or self-citation

full rationale

The paper introduces two norm-independent step-size rules for the existing GrpADMM framework and proves global convergence plus sublinear/pointwise rates under standard convex optimization assumptions. These proofs follow the usual Lyapunov descent arguments for proximal ADMM variants; the step-size choices are shown to satisfy the required inequalities directly from local iterate information without backtracking or operator norms. No quoted derivation reduces a claimed rate or convergence result to a fitted parameter or to a prior self-citation by construction. The analysis is self-contained against external benchmarks for this class of methods.

Axiom & Free-Parameter Ledger

1 free parameters · 1 axioms · 0 invented entities

The paper relies on standard convexity and qualification assumptions common to ADMM convergence theory; no new free parameters beyond the choice of initial step-size threshold are introduced in the abstract, and no new entities are postulated.

free parameters (1)
  • initial step-size threshold
    The first strategy reduces to fixed-step ADMM when the initial step-size is chosen below a certain threshold; this threshold is a tunable parameter whose specific value is not derived from first principles.
axioms (1)
  • domain assumption Standard assumptions for convergence of proximal ADMM-type methods (convexity of objective, linear constraints, existence of solutions)
    Invoked to establish global convergence and sublinear rates; location referenced in abstract as 'under standard assumptions'.

pith-pipeline@v0.9.0 · 5676 in / 1408 out tokens · 26816 ms · 2026-05-18T09:12:05.526680+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

28 extracted references · 28 canonical work pages

  1. [1]

    S. Boyd, N. Parikh, E. Chu, B. Peleato, J. Eckstein, et al. Distributed optimization and statistical learning via the alternating direction method of multipliers. Foundations and Trends® in Machine learning , 3(1):1–122, 2011

  2. [2]

    Chambolle and T

    A. Chambolle and T. Pock. A first-order primal-dual algor ithm for convex problems with applications to imaging. Journal of mathematical imaging and vision , 40(1):120–145, 2011

  3. [3]

    Chang and J

    X. Chang and J. Yang. A golden ratio primal–dual algorith m for structured convex opti- mization. Journal of Scientific Computing , 87(2):47, 2021

  4. [4]

    L. Chapel. Introduction to unbalanced optimal transpor t and its efficient computational solu- tions. https://kantorovich.org/event/ki-seminar-chapel/Cha pel-slides.pdf, 2024. Slides; accessed 2025-09-14

  5. [5]

    C. Chen, R. H. Chan, S. Ma, and J. Yang. Inertial proximal a dmm for linearly constrained separable convex optimization. SIAM Journal on Imaging Sciences , 8(4):2239–2267, 2015

  6. [6]

    H. Chen, G. Gu, and J. Yang. A golden ratio proximal altern ating direction method of multipliers for separable convex optimization. Journal of Global Optimization , 87:581–602, 2023

  7. [7]

    Chizat, G

    L. Chizat, G. Peyré, B. Schmitzer, and F.-X. Vialard. Unb alanced optimal transport: Dy- namic and kantorovich formulations. Journal of Functional Analysis , 274(11):3090–3123, 2018. 23

  8. [8]

    Eckstein

    J. Eckstein. Some saddle-function splitting methods fo r convex programming. Optimization Methods and Software , 4(1):75–83, 1994

  9. [9]

    Eckstein and D

    J. Eckstein and D. P. Bertsekas. On the douglas–rachford splitting method and the proximal point algorithm for maximal monotone operators. Mathematical Programming, 55:293–318, 1992

  10. [10]

    Gabay and B

    D. Gabay and B. Mercier. A dual algorithm for the solutio n of nonlinear variational problems via finite element approximation. Computers & Mathematics with Applications , 2:17–40, 1976

  11. [11]

    Glowinski and A

    R. Glowinski and A. Marroco. Sur l’approximation à l’ai de de méthodes de pénalisation- dualité. M2AN, 9:41–76, 1975

  12. [12]

    Goldstein and S

    T. Goldstein and S. Osher. The split bregman method for l 1-regularized problems. SIAM Journal on Imaging Sciences , 2(2):323–343, 2009

  13. [13]

    He, L.-Z

    B. He, L.-Z. Liao, D. Han, and H. Yang. A new inexact alter nating directions method for monotone variational inequalities. Mathematical Programming, 92(1):103–118, 2002

  14. [14]

    M. R. Hestenes. Multiplier and gradient methods. Journal of Optimization Theory and Applications, 4(5):303–320, 1969

  15. [15]

    Padcharoen, P

    A. Padcharoen, P. Kumam, and J. Martínez-Moreno. Augme nted lagrangian method for tv-l1-l2 based colour image restoration. Journal of Computational and Applied Mathematics , 354:507–519, 2019

  16. [16]

    Parikh and S

    N. Parikh and S. Boyd. Proximal algorithms. Foundations and Trends in Optimization , 1(3):127–239, 2014

  17. [17]

    Peyré and M

    G. Peyré and M. Cuturi. Computational Optimal Transport , volume 11. Foundations and Trends in Machine Learning, 2019

  18. [18]

    M. J. D. Powell. A method for nonlinear constraints in mi nimization problems. Academic Press, pages 283–298, 1969

  19. [19]

    Rockafellar

    R. Rockafellar. Convex analysis, princeton: Princeto n u, 1970

  20. [20]

    L. I. Rudin, S. Osher, and E. Fatemi. Nonlinear total var iation based noise removal algo- rithms. Physica D: nonlinear phenomena , 60(1-4):259–268, 1992

  21. [21]

    Shefi and M

    R. Shefi and M. Teboulle. Rate of convergence analysis of decomposition methods based on the proximal method of multipliers for convex minimization . SIAM Journal on Optimization , 24(1):269–297, 2014

  22. [22]

    Tao and X

    M. Tao and X. Yuan. Recovering low-rank and sparse compo nents of matrices from incom- plete and noisy observations. SIAM Journal on Optimization , 21(1):57–81, 2011

  23. [23]

    Tibshirani

    R. Tibshirani. Regression shrinkage and selection via the lasso. Journal of the Royal Statis- tical Society Series B: Statistical Methodology , 58(1):267–288, 1996

  24. [24]

    C. Villani. Topics in Optimal Transportation . American Mathematical Society, 2003. 24

  25. [25]

    Wang and X

    X. Wang and X. Yuan. The linearized alternating directi on method of multipliers for dantzig selector. SIAM Journal on Scientific Computing , 34(5):A2792–A2811, 2012

  26. [26]

    Yang and X

    J. Yang and X. Yuan. Linearized augmented lagrangian an d alternating direction methods for nuclear norm minimization. Mathematics of computation , 82(281):301–329, 2013

  27. [27]

    Yin and J

    C. Yin and J. Yang. Golden ratio proximal gradient admm f or distributed composite convex optimization. Journal of Optimization Theory and Applications , 200(3):895–922, 2024

  28. [28]

    X. Yuan. Alternating direction method for covariance s election models. Journal of Scientific Computing, 51(2):261–273, 2012. 25