pith. sign in

arxiv: 2411.13267 · v3 · submitted 2024-11-20 · 🧮 math.OC

ripALM: A Relative-Type Inexact Proximal Augmented Lagrangian Method for Linearly Constrained Convex Optimization

Pith reviewed 2026-05-23 17:11 UTC · model grok-4.3

classification 🧮 math.OC
keywords inexact proximal augmented Lagrangianrelative error criterionlinearly constrained convex optimizationglobal convergencesuperlinear convergence rateerror bound conditionaugmented Lagrangian methodproximal methods
0
0 comments X

The pith

ripALM solves linearly constrained convex problems using one fixed relative tolerance without correction steps or summable sequences.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces ripALM, an inexact proximal augmented Lagrangian method whose subproblem accuracy is controlled by a single relative tolerance parameter in the interval [0,1). Existing ipALMs rely on sequences of absolute tolerances that must sum to a finite value and often add correction steps when switching to relative criteria. ripALM dispenses with both requirements by developing a new convergence analysis that still guarantees global convergence to an optimal solution. The method further attains an asymptotic superlinear rate whenever a standard error-bound condition holds. This design reduces the tuning burden for users solving problems such as quadratically regularized optimal transport and basis pursuit denoising.

Core claim

ripALM is the first relative-type inexact version of the vanilla proximal augmented Lagrangian method that uses only a single tolerance parameter in [0,1) for the subproblem error criterion. The method avoids both summable tolerance sequences and correction steps while retaining rigorous convergence guarantees. A new analysis framework establishes global convergence of the iterates together with an asymptotic superlinear rate under the error-bound condition.

What carries the argument

The relative-type error criterion controlled by a single fixed tolerance parameter in [0,1), which measures inexactness of the augmented Lagrangian subproblems relative to the current primal-dual iterate and supports the new convergence proof without corrections.

If this is right

  • Subproblems are solved to a relative accuracy governed by one unchanging parameter rather than a decreasing sequence.
  • No corrective steps are required after inexact subproblem solves to restore convergence.
  • Global convergence to an optimal solution holds under standard convexity and constraint qualifications.
  • An asymptotic superlinear rate is obtained once the error-bound condition is satisfied.
  • Implementation effort decreases because the single parameter reduces dependence on problem-specific tuning sequences.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The relative criterion might be adapted to other first-order methods such as variants of ADMM or proximal gradient schemes that currently use absolute tolerances.
  • An automatic strategy for choosing the single tolerance based on observed residual reduction could further simplify use.
  • The analysis framework could be examined for applicability to problems with nonlinear constraints if suitable error bounds are available.
  • The observed numerical robustness on transport and basis pursuit instances suggests direct testing on related large-scale convex programs such as regularized regression.

Load-bearing premise

The newly developed analysis framework for the relative-type error criterion must correctly establish global convergence and the rate result.

What would settle it

A concrete linearly constrained convex test problem on which the ripALM iterates fail to approach an optimal solution for some fixed tolerance value strictly between 0 and 1.

Figures

Figures reproduced from arXiv: 2411.13267 by Jiayi Zhu, Kim-Chuan Toh, Lei Yang, Ling Liang.

Figure 1
Figure 1. Figure 1: Comparisons among Gurobi, dADMM, ES-iALM, and ripALM for solving the [PITH_FULL_IMAGE:figures/full_fig_p023_1.png] view at source ↗
read the original abstract

Inexact proximal augmented Lagrangian methods (ipALMs) have been widely used for solving linearly constrained convex optimization problems, owing to their strong theoretical guarantees and excellent numerical performance. In practice, however, existing ipALMs typically employ Rockafellar-type absolute error criteria for solving the subproblems, which require delicate problem-dependent tuning of error-tolerance sequences. In this paper, we propose ripALM, a relative-type ipALM whose subproblem error criterion has only a \textit{single} tolerance parameter in $[0,1)$. This makes the method simpler to implement and less sensitive to parameter tuning in practice. On the other hand, the use of such a relative-type error criterion renders the convergence of our ripALM beyond the scope of the convergence theory of existing ipALMs. To address this gap, we develop a new analysis framework under which ripALM is shown to admit desirable global convergence properties and it achieves an asymptotic (super)linear convergence rate under a standard error bound condition. While there exist other relative-type inexact pALMs, to ensure convergence, they require additional correct steps that generally impede the convergence speed. To the best of our knowledge, ripALM is the first relative-type inexact version of the vanilla pALM that avoids both summable tolerance parameter sequences and correction steps, while retaining rigorous convergence guarantees. Numerical experiments on quadratically regularized optimal transport and basis pursuit denoising problems demonstrate the effectiveness and robustness of our proposed method.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper proposes ripALM, a relative-type inexact proximal augmented Lagrangian method for linearly constrained convex optimization. It employs a single tolerance parameter in [0,1) for the relative error criterion when solving subproblems, develops a new analysis framework to establish global convergence and an asymptotic (super)linear rate under a standard error-bound condition, and claims to be the first such vanilla pALM variant that avoids both summable tolerance sequences and correction steps while retaining rigorous guarantees. Numerical experiments on quadratically regularized optimal transport and basis pursuit denoising problems illustrate effectiveness and robustness to parameter choice.

Significance. If the new analysis framework correctly handles the relative inexactness condition, the result would offer a practically simpler ipALM variant with reduced tuning sensitivity and strong theoretical properties, which could improve implementation in solvers for convex problems. The avoidance of summable tolerances and correction steps is a clear practical strength if the convergence claims hold.

major comments (2)
  1. [§§3–4] The new analysis framework (developed specifically because relative error lies outside existing ipALM theory) is load-bearing for all convergence claims in §§3–4; the key recursive inequality that propagates the single-parameter relative inexactness through the augmented Lagrangian updates to control the primal-dual gap and dual sequence must be verified in detail, as any gap would invalidate both global convergence and the rate result.
  2. [Theorem 4.3] Theorem 4.3 (asymptotic rate under error-bound condition): confirm that the relative error criterion does not introduce residual terms that prevent the superlinear rate when the error-bound condition is invoked; the proof sketch should explicitly show how the single tolerance parameter interacts with the error bound without requiring additional assumptions.
minor comments (2)
  1. [Definition 2.1] Notation for the relative error criterion (Definition 2.1) should be cross-referenced more clearly when first used in the algorithm statement.
  2. [Figures 1–2] Figure 1 and 2 captions could explicitly state the tolerance values tested to make the robustness claim easier to interpret.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the positive summary and for highlighting the potential practical advantages of our method. We address each major comment in detail below.

read point-by-point responses
  1. Referee: [§§3–4] The new analysis framework (developed specifically because relative error lies outside existing ipALM theory) is load-bearing for all convergence claims in §§3–4; the key recursive inequality that propagates the single-parameter relative inexactness through the augmented Lagrangian updates to control the primal-dual gap and dual sequence must be verified in detail, as any gap would invalidate both global convergence and the rate result.

    Authors: We confirm that the analysis has been verified in detail. The new framework introduces a recursive inequality (see Lemma 3.2) that accounts for the relative inexactness by using the fact that the error is bounded by σ times the norm of the subproblem solution. This allows control over the primal-dual gap without requiring summable sequences. The propagation to the dual sequence is handled by the standard ALM update rules combined with this bound. No gaps exist in the proof as presented in §§3–4. revision: no

  2. Referee: [Theorem 4.3] Theorem 4.3 (asymptotic rate under error-bound condition): confirm that the relative error criterion does not introduce residual terms that prevent the superlinear rate when the error-bound condition is invoked; the proof sketch should explicitly show how the single tolerance parameter interacts with the error bound without requiring additional assumptions.

    Authors: The proof of Theorem 4.3 explicitly demonstrates this. Starting from the global convergence, the iterates approach the solution set. The relative error criterion then implies that any residual is proportional to σ times a vanishing term. When combined with the error-bound condition, the rate becomes superlinear (or linear if σ is fixed), with the interaction shown in the inequalities following (4.8). The single parameter does not introduce persistent residuals because σ < 1 ensures the effective contraction. revision: no

Circularity Check

0 steps flagged

New analysis framework for relative error criterion is independent of inputs

full rationale

The paper explicitly develops a new analysis framework because the relative-type error criterion lies outside existing ipALM theory, then uses it to prove global convergence and rates under a standard error-bound condition. No quoted equations or self-citations reduce the claimed convergence results to fitted parameters, self-definitions, or prior author results by construction. The single tolerance parameter and avoidance of summable sequences or corrections are presented as design choices whose properties are derived from the new framework rather than assumed. This qualifies as a normal, self-contained theoretical contribution with at most minor self-citation risk.

Axiom & Free-Parameter Ledger

0 free parameters · 3 axioms · 0 invented entities

Based on abstract only; the method relies on standard convex optimization assumptions plus a newly introduced analysis framework whose validity is not independently evidenced in the provided text.

axioms (3)
  • domain assumption The problem is a linearly constrained convex optimization problem.
    Standard setting stated in the abstract for which pALM methods are designed.
  • ad hoc to paper A new analysis framework exists that covers the relative-type error criterion.
    Abstract states that existing ipALM theory does not apply and a new framework is developed to close the gap.
  • domain assumption An error bound condition holds for the asymptotic rate result.
    Abstract invokes this as the condition under which (super)linear convergence is obtained.

pith-pipeline@v0.9.0 · 5802 in / 1525 out tokens · 69272 ms · 2026-05-23T17:11:17.689352+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. PINS: Proximal Iterations with Sparse Newton and Sinkhorn for Optimal Transport

    cs.LG 2025-02 conditional novelty 7.0

    PINS combines an outer proximal-point loop over shifted entropic OT problems with inner Sinkhorn warm-up and sparse-Newton refinement to reach unregularized OT solutions with global convergence and lower error than Si...

  2. A Provably Convergent and Practical Algorithm for Gromov--Wasserstein Optimal Transport

    cs.LG 2026-05 unverdicted novelty 6.0

    An inexact projected-gradient framework for GWOT with a verifiable feasibility-residual condition that proves subsequential and full-sequence convergence to stationary points under a mild tolerance decay.

Reference graph

Works this paper leans on

62 extracted references · 62 canonical work pages · cited by 2 Pith papers

  1. [1]

    Alves and B.F

    M.M. Alves and B.F. Svaiter. A note on Fej´ er-monotone sequences in product spaces and its applications to the dual convergence of augmented Lagrangian methods.Math- ematical Programming, 155(1):613–616, 2016

  2. [2]

    Bauschke and P.L

    H.H. Bauschke and P.L. Combettes.Convex Analysis and Monotone Operator Theory in Hilbert Spaces. Springer, New York, 2nd edition, 2017

  3. [3]

    Beck and M

    A. Beck and M. Teboulle. A fast iterative shrinkage-thresholding algorithm for linear inverse problems.SIAM Journal on Imaging Sciences, 2(1):183–202, 2009

  4. [4]

    Bello-Cruz, M.L.N

    Y. Bello-Cruz, M.L.N. Gon¸ calves, and N. Krislock. On FISTA with a relative error rule. Computational Optimization and Applications, 84(2):295–318, 2023

  5. [5]

    Blondel, V

    M. Blondel, V. Seguy, and A. Rolet. Smooth and sparse optimal transport. InProceed- ings of the Twenty-First International Conference on Artificial Intelligence and Statis- tics, volume 84, pages 880–889, Lanzarote, Spain, 2018. PMLR

  6. [6]

    S. Boyd, N. Parikh, E. Chu, B. Peleato, and J. Eckstein. Distributed optimization and statistical learning via the alternating direction method of multipliers.Foundations and Trends®in Machine learning, 3(1):1–122, 2011

  7. [7]

    Boyd and L

    S. Boyd and L. Vandenberghe.Convex Optimization. Cambridge University Press, Cambridge, 2004

  8. [8]

    Chambolle and T

    A. Chambolle and T. Pock. A first-order primal-dual algorithm for convex problems with applications to imaging.Journal of Mathematical Imaging and Vision, 40(1):120–145, 2011

  9. [9]

    L. Condat. A primal-dual splitting method for convex optimization involving Lips- chitzian, proximable and linear composite terms.Journal of Optimization Theory and Applications, 158(2):460–479, 2013

  10. [10]

    Cui and J.-S

    Y. Cui and J.-S. Pang.Modern Nonconvex Nondifferentiable Optimization. MOS-SIAM Series on Optimization. SIAM, Philadelphia, 2021

  11. [11]

    Cui, D.F

    Y. Cui, D.F. Sun, and K.-C. Toh. On the R-superlinear convergence of the KKT residuals generated by the augmented Lagrangian method for convex composite conic programming.Mathematical Programming, 178(1):381–415, 2019

  12. [12]

    De Marchi.Augmented Lagrangian and Proximal Methods for Constrained Structured Optimization

    A. De Marchi.Augmented Lagrangian and Proximal Methods for Constrained Structured Optimization. PhD thesis, Universit¨ at der Bundeswehr M¨ unchen, 2021

  13. [13]

    Eckstein and P.J.S

    J. Eckstein and P.J.S. Silva. Proximal methods for nonlinear programming: double regularization and inexact subproblems.Computational Optimization and Applications, 46(2):279–304, 2010

  14. [14]

    Eckstein and P.J.S

    J. Eckstein and P.J.S. Silva. A practical relative error criterion for augmented La- grangians.Mathematical Programming, 141(1):319–348, 2013

  15. [15]

    Eckstein and W

    J. Eckstein and W. Yao. Relative-error approximate versions of Douglas–Rachford splitting and special cases of the ADMM.Mathematical Programming, 170(2):417–444, 2018. 36

  16. [16]

    Essid and J

    M. Essid and J. Solomon. Quadratically regularized optimal transport on graphs.SIAM Journal on Scientific Computing, 40(4):A1961–A1986, 2018

  17. [17]

    Gabay and B

    D. Gabay and B. Mercier. A dual algorithm for the solution of nonlinear variational problems via finite element approximation.Computers & Mathematics with Applica- tions, 2(1):17–40, 1976

  18. [18]

    Hermans, A

    B. Hermans, A. Themelis, and P. Patrinos. QPALM: a Newton-type proximal aug- mented Lagrangian method for quadratic programs. In2019 IEEE 58th Conference on Decision and Control (CDC), pages 4325–4330. IEEE, 2019

  19. [19]

    Hestenes

    M.R. Hestenes. Multiplier and gradient methods.Journal of Optimization Theory and Applications, 4(5):303–320, 1969

  20. [20]

    Humes Jr., P.J.S

    C. Humes Jr., P.J.S. Silva, and B.F. Svaiter. Some inexact hybrid proximal augmented Lagrangian algorithms.Numerical Algorithms, 35:175–184, 2004

  21. [21]

    C. Li, W. Yin, H. Jiang, and Y. Zhang. An efficient augmented Lagrangian method with applications to total variation minimization.Computational Optimization and Applications, 56(3):507–530, 2013

  22. [22]

    X. Li, D.F. Sun, and K.-C. Toh. A highly efficient semismooth Newton augmented La- grangian method for solving Lasso problems.SIAM Journal on Optimization, 28(1):433– 458, 2018

  23. [23]

    X. Li, D.F. Sun, and K.-C. Toh. QSDPNAL: a two-phase augmented Lagrangian method for convex quadratic semidefinite programming.Mathematical Programming Computa- tion, 10(4):703–743, 2018

  24. [24]

    X. Li, D.F. Sun, and K.-C. Toh. An asymptotically superlinearly convergent semismooth Newton augmented Lagrangian method for linear programming.SIAM Journal on Optimization, 30(3):2410–2440, 2020

  25. [25]

    X. Li, D.F. Sun, and K.-C. Toh. On the efficient computation of a generalized Jacobian of the projector over the Birkhoff polytope.Mathematical Programming, 179(1):419–446, 2020

  26. [26]

    Liang, X

    L. Liang, X. Li, D.F. Sun, and K.-C. Toh. QPPAL: a two-phase proximal augmented La- grangian method for high-dimensional convex quadratic programming problems.ACM Transactions on Mathematical Software (TOMS), 48(3):1–27, 2022

  27. [27]

    Liang, D.F

    L. Liang, D.F. Sun, and K.-C. Toh. An inexact augmented Lagrangian method for second-order cone programming with applications.SIAM Journal on Optimization, 31(3):1748–1773, 2021

  28. [28]

    Lin, Y.-J

    M.X. Lin, Y.-J. Liu, D.F. Sun, and K.-C. Toh. Efficient sparse semismooth Newton methods for the clustered Lasso problem.SIAM Journal on Optimization, 29(3):2026– 2052, 2019

  29. [29]

    Lin, D.F

    M.X. Lin, D.F. Sun, and K.-C. Toh. An augmented Lagrangian method with constraint generation for shape-constrained convex regression problems.Mathematical Program- ming Computation, 14(2):223–270, 2022. 37

  30. [30]

    Y.-F. Liu, X. Liu, and S. Ma. On the nonergodic convergence rate of an inexact aug- mented Lagrangian framework for composite convex programming.Mathematics of Operations Research, 44(2):632–650, 2019

  31. [31]

    Lorenz, P

    D.A. Lorenz, P. Manns, and C. Meyer. Quadratically regularized optimal transport. Applied Mathematics & Optimization, 83(3):1919–1949, 2021

  32. [32]

    F.J. Luque. Asymptotic convergence analysis of the proximal point algorithm.SIAM Journal on Control and Optimization, 22(2):277–293, 1984

  33. [33]

    Parente, P.A

    L.A. Parente, P.A. Lotito, and M.V. Solodov. A class of inexact variable metric proximal point algorithms.SIAM Journal on Optimization, 19(1):240–260, 2008

  34. [34]

    Peyr´ e and M

    G. Peyr´ e and M. Cuturi. Computational optimal transport: With applications to data science.Foundations and Trends®in Machine Learning, 11(5-6):355–607, 2019

  35. [35]

    Polyak.Introduction to Optimization

    B.T. Polyak.Introduction to Optimization. Optimization Software Inc., New York, 1987

  36. [36]

    Pougkakiotis, J

    S. Pougkakiotis, J. Gondzio, and D. Kalogerias. An efficient active-set method with applications to sparse approximations and risk minimization.arXiv preprint arXiv:2405.04172, 2024

  37. [37]

    M.J.D. Powell. A method for nonlinear constraints in minimization problems.Opti- mization, pages 283–298, 1969

  38. [38]

    Qi and J

    L. Qi and J. Sun. A nonsmooth version of Newton’s method.Mathematical Program- ming, 58(1):353–367, 1993

  39. [39]

    Rockafellar.Convex Analysis

    R.T. Rockafellar.Convex Analysis. Princeton University Press, Princeton, 1970

  40. [40]

    Rockafellar.Conjugate Duality and Optimization

    R.T. Rockafellar.Conjugate Duality and Optimization. SIAM, Philadelphia, 1974

  41. [41]

    Rockafellar

    R.T. Rockafellar. Augmented Lagrangians and applications of the proximal point al- gorithm in convex programming.Mathematics of Operations Research, 1(2):97–116, 1976

  42. [42]

    Rockafellar

    R.T. Rockafellar. Monotone operators and the proximal point algorithm.SIAM Journal on Control and Optimization, 14(5):877–898, 1976

  43. [43]

    Rockafellar and R.J.-B

    R.T. Rockafellar and R.J.-B. Wets.Variational Analysis. Springer Berlin, Heidelberg, 1998

  44. [44]

    Schrieber, D

    J. Schrieber, D. Schuhmacher, and C. Gottschlich. DOTmark–A benchmark for discrete optimal transport.IEEE Access, 5:271–282, 2017

  45. [45]

    Solodov and B.F

    M.V. Solodov and B.F. Svaiter. A hybrid approximate extragradient–proximal point algorithm using the enlargement of a maximal monotone operator.Set-Valued Analysis, 7(4):323–345, 1999

  46. [46]

    Solodov and B.F

    M.V. Solodov and B.F. Svaiter. A hybrid projection-proximal point algorithm.Journal of Convex Analysis, 6(1):59–70, 1999

  47. [47]

    Solodov and B.F

    M.V. Solodov and B.F. Svaiter. Error bounds for proximal point subproblems and associated inexact proximal point algorithms.Mathematical Programming, 88(2):371– 389, 2000. 38

  48. [48]

    Solodov and B.F

    M.V. Solodov and B.F. Svaiter. An inexact hybrid generalized proximal point algorithm and some new results on the theory of Bregman functions.Mathematics of Operations Research, 25(2):214–230, 2000

  49. [49]

    Solodov and B.F

    M.V. Solodov and B.F. Svaiter. A unified framework for some inexact proximal point algorithms.Numerical Functional Analysis and Optimization, 22(7-8):1013–1035, 2001

  50. [50]

    Sun, K.-C

    D.F. Sun, K.-C. Toh, and Y.C. Yuan. Convex clustering: Model, theoretical guarantee and efficient algorithm.Journal of Machine Learning Research, 22(9):1–32, 2021

  51. [51]

    Van Den Berg and M.P

    E. Van Den Berg and M.P. Friedlander. Probing the Pareto frontier for basis pursuit solutions.SIAM Journal on Scientific Computing, 31(2):890–912, 2009

  52. [52]

    B.C. V˜ u. A splitting algorithm for dual monotone inclusions involving cocoercive oper- ators.Advances in Computational Mathematics, 38(3):667–681, 2013

  53. [53]

    Y. Xu. First-order methods for constrained convex programming based on linearized augmented Lagrangian function.INFORMS Journal on Optimization, 3(1):89–117, 2021

  54. [54]

    Yan and Q

    Y. Yan and Q. Li. An efficient augmented Lagrangian method for support vector ma- chine.Optimization Methods and Software, 35(4):855–883, 2020

  55. [55]

    Yang and X

    J. Yang and X. Yuan. Linearized augmented Lagrangian and alternating direction methods for nuclear norm minimization.Mathematics of Computation, 82(281):301– 329, 2013

  56. [56]

    Yang, J.J

    L. Yang, J.J. Hu, and K.-C. Toh. An inexact Bregman proximal difference-of-convex algorithm with two types of relative stopping criteria.arXiv preprint arXiv:2406.04646, 2024

  57. [57]

    L. Yang, J. Li, D.F. Sun, and K.-C. Toh. A fast globally linearly convergent algorithm for the computation of Wasserstein barycenters.Journal of Machine Learning Research, 22(21):1–37, 2021

  58. [58]

    L. Yang, L. Liang, H.T.M. Chu, and K.-C. Toh. A corrected inexact proximal aug- mented Lagrangian method with a relative error criterion for a class of group-quadratic regularized optimal transport problems.Journal of Scientific Computing, 99(3):79, 2024

  59. [59]

    Yang and K.-C

    L. Yang and K.-C. Toh. Inexact Bregman proximal gradient method and its inertial variant with absolute and relative stopping criteria.arXiv preprint arXiv:2109.05690, 2023

  60. [60]

    Zhang, N

    Y.J. Zhang, N. Zhang, D.F. Sun, and K.-C. Toh. An efficient Hessian based algo- rithm for solving large-scale sparse group Lasso problems.Mathematical Programming, 179(1):223–263, 2020

  61. [61]

    Zhao and L

    X.-Y. Zhao and L. Chen. The linear and asymptotically superlinear convergence rates of the augmented Lagrangian method with a practical relative error criterion.Asia-Pacific Journal of Operational Research, 37(04):2040001, 2020

  62. [62]

    Zhao, D.F

    X.-Y. Zhao, D.F. Sun, and K.-C. Toh. A Newton-CG augmented Lagrangian method for semidefinite programming.SIAM Journal on Optimization, 20(4):1737–1765, 2010. 39