Convergence of iterates and improved rates for accelerated augmented Lagrangian methods for linearly constrained convex optimization

Nan-Jing Hang; Xin He; Ya-Ping Fang; Yi-Bin Xiao

arxiv: 2605.19467 · v1 · pith:2KS5VGEXnew · submitted 2026-05-19 · 🧮 math.OC

Convergence of iterates and improved rates for accelerated augmented Lagrangian methods for linearly constrained convex optimization

Xin He , Nan-Jing Hang , Yi-Bin Xiao , Ya-Ping Fang This is my paper

Pith reviewed 2026-05-20 04:46 UTC · model grok-4.3

classification 🧮 math.OC

keywords augmented Lagrangian methodNesterov extrapolationconvergence rateslinear constraintsconvex optimizationprimal-dual algorithmfeasibility violationobjective residual

0 comments

The pith

Accelerated augmented Lagrangian methods converge to saddle points with o(1/k^2) rates for feasibility violation and objective residual when parameters avoid the critical regime.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces a family of accelerated augmented Lagrangian methods that incorporate Nesterov extrapolation parameters for linearly constrained convex optimization problems with differentiable objectives. It proves that the primal-dual sequence generated by these schemes converges to a saddle point while delivering accelerated rates on the augmented Lagrangian gap, the feasibility violation, and the objective residual. In the noncritical parameter regime the rates strengthen from the conventional O(1/k^2) to the stricter little-o(1/k^2) decay. The authors note that little-o rates simultaneously covering both feasibility and objective residual have not appeared before for accelerated augmented Lagrangian methods in this setting.

Core claim

Under suitable parameter conditions the implicit-gradient and partially explicit variants ensure convergence of the primal-dual sequence to a saddle point, while the augmented Lagrangian gap, feasibility violation, and objective residual each satisfy accelerated estimates that improve to little-o(1/k^2) once the Nesterov extrapolation parameters lie outside the critical regime.

What carries the argument

Nesterov extrapolation parameters embedded in the augmented Lagrangian iteration that can be chosen in the noncritical regime to secure both iterate convergence and the little-o rate improvement.

If this is right

The primal-dual sequence converges to a saddle point of the constrained problem.
The augmented Lagrangian gap, feasibility violation, and objective residual each enjoy accelerated convergence estimates.
In the noncritical regime these three quantities improve from O(1/k^2) to little-o(1/k^2).
The results hold for both continuously differentiable convex objectives (implicit-gradient variant) and smooth convex objectives (partially explicit variant).

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same noncritical tuning strategy may transfer to other inertial primal-dual schemes that rely on extrapolation parameters.
Explicit characterization of the noncritical interval could simplify parameter selection in software implementations.
Stochastic or distributed versions of the method might inherit the little-o feasibility and residual rates under analogous parameter restrictions.

Load-bearing premise

Suitable non-critical choices of the Nesterov extrapolation parameters exist that simultaneously guarantee convergence and produce the improved little-o rates.

What would settle it

A concrete linearly constrained convex problem together with a sequence of noncritical parameters for which the feasibility violation or objective residual fails to decay faster than C/k^2 for any fixed C.

read the original abstract

We consider a linearly constrained convex optimization problem with a differentiable objective function. Motivated by an inertial primal-dual dynamical system with vanishing damping, we propose a class of accelerated augmented Lagrangian methods with Nesterov extrapolation parameters. The framework contains two variants: an implicit-gradient scheme for convex continuously differentiable objectives and a partially explicit scheme for convex smooth objectives. Under suitable parameter conditions, we prove convergence of the primal-dual sequence to a saddle point, together with accelerated estimates for the augmented Lagrangian gap, the feasibility violation, and the objective residual. In the noncritical parameter regime, these estimates are improved from $O(1/k^2)$ to $o(1/k^2)$. To the best of our knowledge, such little-o rates for both feasibility violation and objective residual have not been previously established for accelerated augmented Lagrangian methods in this setting.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper gets little-o rates on feasibility and objective residual for accelerated AL methods, but the non-critical parameter choices are the part that needs the most scrutiny.

read the letter

The main point is that this work claims little-o(1/k^2) rates for feasibility violation and objective residual in accelerated augmented Lagrangian methods, on top of standard convergence results. That distinction from prior O(1/k^2) analyses is what they flag as new for this class of schemes. They build the methods from an inertial primal-dual dynamical system with vanishing damping and present both an implicit-gradient version for C1 convex objectives and a partially explicit version for smooth ones. Under suitable conditions they show the primal-dual sequence converges to a saddle point and recover accelerated bounds on the augmented Lagrangian gap, feasibility, and objective residual, with the little-o improvement in the non-critical regime. To their knowledge those little-o rates on feasibility and objective error had not been shown before for accelerated AL methods in this setting. That is a concrete step forward if the proofs hold. The dynamical-system link is used cleanly to motivate the discrete schemes and separate the rate regimes. The critical versus non-critical split for the Nesterov parameters is a reasonable organizing device that lets them recover the usual rates when needed and push further when the parameters stay away from the boundary case. The soft spot is exactly the one the stress-test note flags. The improved rates rest on the existence of Nesterov extrapolation parameters that keep the scheme in the non-critical regime while still guaranteeing convergence for general differentiable convex objectives. The abstract only invokes this through phrases like “under suitable parameter conditions” and “in the noncritical parameter regime.” Without an explicit, problem-independent construction or a clear existence argument in the full text, it is difficult to see how readily these parameters can be chosen in practice. If the paper supplies a concrete recipe or proves such parameters always exist without extra assumptions on the objective, the concern shrinks; otherwise the result stays somewhat conditional. This paper is for researchers working on first-order methods and rate analysis for linearly constrained convex problems. Readers who follow continuous-time perspectives on discrete optimization algorithms will find the framework useful. It has enough structure and a specific claim to deserve a serious referee, though the parameter selection will almost certainly need clarification in revision. I would send it to peer review rather than desk reject.

Referee Report

1 major / 1 minor

Summary. The manuscript proposes a class of accelerated augmented Lagrangian methods with Nesterov extrapolation parameters for linearly constrained convex optimization problems with differentiable objectives. Motivated by an inertial primal-dual dynamical system with vanishing damping, the framework includes an implicit-gradient scheme for convex C1 objectives and a partially explicit scheme for smooth convex objectives. Under suitable parameter conditions, the authors prove convergence of the primal-dual sequence to a saddle point along with accelerated estimates for the augmented Lagrangian gap, feasibility violation, and objective residual; in the noncritical parameter regime these estimates improve from O(1/k²) to o(1/k²).

Significance. If the central claims hold, the work would be a solid contribution to the analysis of accelerated first-order methods for constrained convex problems. The explicit link to a continuous inertial dynamical system supplies useful intuition, and the reported little-o rates for feasibility violation and objective residual would be novel in this setting. The distinction between critical and noncritical regimes for the extrapolation parameters adds technical nuance that could inform future parameter tuning.

major comments (1)

[Abstract and §4] Abstract and statements of the main convergence theorems (e.g., Theorems 3.1–3.3 and the rate results in §4): the improved o(1/k²) claims for the augmented Lagrangian gap, feasibility violation, and objective residual rest on the existence of Nesterov extrapolation parameters that simultaneously place the scheme in the noncritical regime and guarantee primal-dual convergence. The manuscript invokes this only through the phrases “under suitable parameter conditions” and “in the noncritical parameter regime” without supplying an explicit, problem-independent construction or a general existence argument that works for arbitrary convex differentiable objectives. This is load-bearing for the central rate-improvement claim.

minor comments (1)

[§2] The notation for the augmented Lagrangian gap and the precise definition of the critical versus noncritical regimes should be introduced with a short self-contained paragraph in §2 before the algorithm statements, to improve readability for readers outside the immediate subfield.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for the careful reading and the positive evaluation of our manuscript. We address the major comment below.

read point-by-point responses

Referee: [Abstract and §4] Abstract and statements of the main convergence theorems (e.g., Theorems 3.1–3.3 and the rate results in §4): the improved o(1/k²) claims for the augmented Lagrangian gap, feasibility violation, and objective residual rest on the existence of Nesterov extrapolation parameters that simultaneously place the scheme in the noncritical regime and guarantee primal-dual convergence. The manuscript invokes this only through the phrases “under suitable parameter conditions” and “in the noncritical parameter regime” without supplying an explicit, problem-independent construction or a general existence argument that works for arbitrary convex differentiable objectives. This is load-bearing for the central rate-improvement claim.

Authors: We agree that providing an explicit construction of such parameters would clarify the presentation and strengthen the central claims. The noncritical regime is defined in the manuscript by a strict inequality on the extrapolation parameters relative to the step-size sequence (see, e.g., the condition (3.5) in the paper). This condition can be satisfied independently of the specific problem data by selecting the Nesterov-type extrapolation parameters with a sufficiently slow decay, for instance by taking α_k = 1 - c/k^β with β ∈ (0,1) and c small enough. We will add a new remark or subsection in the revised manuscript that explicitly constructs such a sequence and verifies that it satisfies both the noncritical regime and the assumptions required for the convergence theorems. This construction is problem-independent and works for any convex differentiable objective satisfying the standing assumptions. revision: yes

Circularity Check

0 steps flagged

No significant circularity in convergence and rate proofs

full rationale

The paper proposes accelerated augmented Lagrangian methods motivated by an inertial dynamical system, then proves primal-dual convergence to a saddle point and accelerated estimates (O(1/k^2) or o(1/k^2) in the noncritical regime) under stated parameter conditions on the Nesterov extrapolation parameters. These results rest on direct analysis of the discrete schemes and the continuous system rather than on any quantity defined in terms of itself, any fitted parameter renamed as a prediction, or any load-bearing self-citation chain. No equations or claims reduce by construction to their inputs; the derivation is self-contained against external mathematical benchmarks.

Axiom & Free-Parameter Ledger

1 free parameters · 1 axioms · 0 invented entities

The central claims rest on standard domain assumptions of convexity and differentiability together with the existence of suitable extrapolation parameters; no new entities are postulated.

free parameters (1)

Nesterov extrapolation parameters
The methods rely on specific choices of these parameters to obtain both convergence and the improved little-o rates; the abstract refers to them as 'suitable parameter conditions' without explicit fitting procedure.

axioms (1)

domain assumption Objective function is convex and continuously differentiable (or smooth)
Invoked to define the problem class and to justify the implicit and partially explicit schemes.

pith-pipeline@v0.9.0 · 5681 in / 1302 out tokens · 45519 ms · 2026-05-20T04:46:35.554917+00:00 · methodology

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

Under suitable parameter conditions, we prove convergence of the primal-dual sequence to a saddle point, together with accelerated estimates for the augmented Lagrangian gap, the feasibility violation, and the objective residual. In the noncritical parameter regime, these estimates are improved from O(1/k²) to o(1/k²).
IndisputableMonolith/Foundation/ArrowOfTime.lean forward_accumulates unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

Motivated by an inertial primal-dual dynamical system with vanishing damping

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

48 extracted references · 48 canonical work pages · 1 internal anchor

[1]

Adil, D., Kyng, R., Peng, R., Sachdeva, S.: Fast algorithms forℓp-regression. J. ACM 71(5), 1–45 (2024)

work page 2024
[2]

Attouch, H., Boţ, R.I., Csetnek, E.R.: Fast optimization via inertial dynamics with closed-loop damping. J. Eur. Math. Soc. 25, 1985–2056 (2023)

work page 1985
[3]

Attouch, H., Peypouquet, J.: The rate of convergence of Nesterov’s accelerated forward-backward method is actually faster than1/k2. SIAM J. Optim. 26, 1824–1834 (2016)

work page 2016
[4]

Attouch, H., Chbani, Z., Riahi, H.: Fast proximal methods via time scaling of damped inertial dynamics. SIAM J. Optim., 29, 2227-2256 (2019)

work page 2019
[5]

Attouch, H., Chbani, Z., Peypouquet, J., Redont, P.: Fast convergence of inertial dynamics and algorithms with asymptotic vanishing viscosity. Math. Program. 168, 123–175 (2018)

work page 2018
[6]

28, 849–874 (2018)

Attouch, H., Cabot, A.: Convergenceratesofinertialforward-backwardalgorithms.SIAMJ.Optim. 28, 849–874 (2018)

work page 2018
[7]

Attouch, H., Chbani, Z., Fadili, J., Riahi, H.: Fast convergence of dynamical ADMM via time scaling of damped inertial dynamics. J. Optim. Theory Appl. 193, 704–736 (2022)

work page 2022
[8]

Bai, J., Chen, Y., Dai, Y.H., Liu, Y.J.: A faster proximal-indefinite augmented Lagrangian method withO(1/k 2)convergence rate. J. Comput. Math. (to appear, 2026)

work page 2026
[9]

Bai, J., Jia, L., Peng, Z.: A new insight on augmented Lagrangian method with applications in machine learning. J. Sci. Comput. 99, 53 (2024)

work page 2024
[10]

SIAM, Philadelphia (2017)

Beck, A.:First-Order Methods in Optimization. SIAM, Philadelphia (2017)

work page 2017
[11]

Athena Scientific, Belmont (2016)

Bertsekas, D.P.:Nonlinear Programming, 3rd edn. Athena Scientific, Belmont (2016)

work page 2016
[12]

Boţ, R.I., Nguyen, D.K.: Improved convergence rates and trajectory convergence for primal-dual dynamical systems with vanishing damping. J. Differential Equations 303, 369–406 (2021)

work page 2021
[13]

Boţ, R.I., Csetnek, E.R., Nguyen, D.K.: Fast augmented Lagrangian method in the convex regime with convergence guarantees for the iterates. Math. Program. 200, 147–197 (2023)

work page 2023
[14]

The iterates of Nesterov's accelerated algorithm converge in the critical regimes , year =

Boţ, R.I., Fadili, J., Nguyen, D.K.: The iterates of Nesterov’s accelerated algorithm converge in the critical regimes. arXiv preprint arXiv:2510.22715 (2025)

work page arXiv 2025
[15]

Boyd, S., Parikh, N., Chu, E., Peleato, B., Eckstein, J.: Distributed optimization and statistical learning via the alternating direction method of multipliers. Found. Trends Mach. Learn. 2, 1–122 (2011)

work page 2011
[16]

Chambolle, A., Dossal, C.: On the convergence of the iterates of the fast iterative shrinkage/thresh- olding algorithm. J. Optim. Theory Appl. 166, 968–982 (2016)

work page 2016
[17]

arXiv preprint arXiv:2109.11537 (2021)

Ghadiri, M., Peng, R., Vempala, S.S.: Fasterp-norm regression using sparsity. arXiv preprint arXiv:2109.11537 (2021)

work page arXiv 2021
[18]

He, B., Yuan, X.: On the acceleration of augmented Lagrangian method for linearly constrained optimization. Optim. Online (2010) 28

work page 2010
[19]

He, X., Hu, R., Fang, Y.P.: Convergence rates of inertial primal-dual dynamical methods for separable convex optimization problems. SIAM J. Control Optim. 59, 3278–3301 (2021)

work page 2021
[20]

He, X., Hu, R., Fang, Y.P.: Inertial accelerated primal-dual methods for linear equality constrained convex optimization problems. Numer. Algorithms, 90(4), 1669-1690 (2022)

work page 2022
[21]

Automatica 146, 110547 (2022)

He, X., Hu, R., Fang, Y.P.: Fast primal-dual algorithm via dynamical system for a linearly con- strained convex optimization problem. Automatica 146, 110547 (2022)

work page 2022
[22]

Trajectory convergence and $o(t^{-2})$ rates for Nesterov accelerated primal-dual dynamics without Lipschitz gradient assumption

He, X., Huang, N.J., Y.B. Xiao, Fang, Y.P.: Trajectory convergence ando(t −2)rates for Nes- terov accelerated primal-dual dynamics without Lipschitz gradient assumption. arXiv preprint arXiv:2605.18236 (2026)

work page internal anchor Pith review Pith/arXiv arXiv 2026
[23]

He, X.: Accelerated primal-dual methods with adaptive parameters for composite convex optimiza- tion with linear constraints. Appl. Numer. Math. 203, 129–143 (2024)

work page 2024
[24]

He, X., Huang, N.J., Fang, Y.P.: Non-ergodic convergence rate of an inertial accelerated primal-dual algorithm for saddle point problems. Commun. Nonlinear Sci. Numer. Simul. 140, 108289 (2025)

work page 2025
[25]

He, X., Huang, N.J., Fang, Y.P.: Accelerated linearized alternating direction method of multipliers with Nesterov extrapolation. Commun. Nonlinear Sci. Numer. Simul., 109818 (2026)

work page 2026
[26]

Com- put

He, X., Fang, Y.P.: Accelerated forward-backward algorithms with subgradient corrections. Com- put. Optim. Appl. 93, 121–156 (2026)

work page 2026
[27]

Huang, B., Ma, S., Goldfarb, D.: Accelerated linearized Bregman method. J. Sci. Comput. 54, 428–453 (2013)

work page 2013
[28]

Point convergence of nesterov’s accelerated gradient method: An ai-assisted proof.arXiv preprint arXiv:2510.23513, 2025

Jang, U., Ryu, E.K.: Point convergence of Nesterov’s accelerated gradient method: An AI-assisted proof. arXiv preprint arXiv:2510.23513 (2025)

work page arXiv 2025
[29]

Kang, M., Yun, S., Woo, H., Kang, M.: Accelerated Bregman method for linearly constrainedℓ1-ℓ2 minimization. J. Sci. Comput. 56, 515–534 (2013)

work page 2013
[30]

Springer, Singapore (2020)

Lin, Z., Li, H., Fang, C.:Accelerated Optimization for Machine Learning. Springer, Singapore (2020)

work page 2020
[31]

Luo, H.: A universal accelerated primal-dual method for convex optimization problems. J. Optim. Theory Appl. 201(1), 280–312 (2024)

work page 2024
[32]

Luo, H., Chen, L.: From differential equation solvers to accelerated first-order methods for convex optimization. Math. Program. 195, 735–781 (2022)

work page 2022
[33]

Luo, H., Zhang, Z.: A unified differential equation solver approach for separable convex optimiza- tion: splitting, acceleration and nonergodic rate. Math. Comp. 94, 3009–3041 (2025)

work page 2025
[34]

Soviet Math

Nesterov, Y: A method for solving the convex programming problem with convergence rateO(1/k2). Soviet Math. Dokl. 27, 372–376 (1983)

work page 1983
[35]

Springer, New York (2004)

Nesterov, Y.:Introductory Lectures on Convex Optimization. Springer, New York (2004)

work page 2004
[36]

32, 204–227 (2022)

Sabach, S., Teboulle, M.: FasterLagrangian-basedmethodsinconvexoptimization.SIAMJ.Optim. 32, 204–227 (2022)

work page 2022
[37]

Shen, J., Mousavi, S.: Least sparsity ofp-norm based optimization problems withp >1. SIAM J. Optim. 28, 2721–2751 (2018) 29

work page 2018
[38]

Su, W., Boyd, S., Candès, E.J.: A differential equation for modeling Nesterov’s accelerated gradient method: theory and insights. J. Mach. Learn. Res. 17, 1–43 (2016)

work page 2016
[39]

Tang, T., Toh, K.C.: Self-adaptive ADMM for semi-strongly convex problems. Math. Program. Comput. 16, 113–150 (2024)

work page 2024
[40]

Tao, M., Yuan, X.: Accelerated Uzawa methods for convex optimization. Math. Comp. 86, 1821– 1845 (2017)

work page 2017
[41]

Tran-Dinh, Q., Zhu, Y.: Non-stationary first-order primal-dual algorithms with fast convergence rates. SIAM J. Optim. 30, 2866–2896 (2020)

work page 2020
[42]

Wibisono, A., Wilson, A.C., Jordan, M.I.: A variational perspective on accelerated methods in optimization. Proc. Natl. Acad. Sci. USA 113, E7351–E7358 (2016)

work page 2016
[43]

In: Proceedings of the International Conference on Machine Learning, pp

Woodruff, D., Yasuda, T.: Sharper bounds forℓ p sensitivity sampling. In: Proceedings of the International Conference on Machine Learning, pp. 37238–37272. PMLR (2023)

work page 2023
[44]

Xie, Z., Yin, W., Wen, Z.: ODE-Based Learning to Optimize. Math. Program. (2025) DOI:10.1007/s10107-025-02303-3

work page doi:10.1007/s10107-025-02303-3 2025
[45]

Xu, Y.: Accelerated first-order primal-dual proximal methods for linearly constrained composite convex programming. SIAM J. Optim. 27, 1459–1484 (2017)

work page 2017
[46]

Zeng, X., Yi, P., Hong, Y., Xie, L.: Distributed continuous-time algorithms for nonsmooth extended monotropic optimization problems. SIAM J. Control Optim. 56, 3973–3993 (2018)

work page 2018
[47]

IEEE Trans

Zeng, X., Lei, J., Chen, J.: Dynamical primal-dual Nesterov accelerated method and its application to network optimization. IEEE Trans. Automat. Control 68, 1760–1775 (2023)

work page 2023
[48]

Zhao, Y., Liao, X., He, X., Zhou, M., Li, C.: Acceleratedprimal-dualmirrordynamicsforcentralized and distributed constrained convex optimization problems. J. Mach. Learn. Res. 24, 1–59 (2023) 30

work page 2023

[1] [1]

Adil, D., Kyng, R., Peng, R., Sachdeva, S.: Fast algorithms forℓp-regression. J. ACM 71(5), 1–45 (2024)

work page 2024

[2] [2]

Attouch, H., Boţ, R.I., Csetnek, E.R.: Fast optimization via inertial dynamics with closed-loop damping. J. Eur. Math. Soc. 25, 1985–2056 (2023)

work page 1985

[3] [3]

Attouch, H., Peypouquet, J.: The rate of convergence of Nesterov’s accelerated forward-backward method is actually faster than1/k2. SIAM J. Optim. 26, 1824–1834 (2016)

work page 2016

[4] [4]

Attouch, H., Chbani, Z., Riahi, H.: Fast proximal methods via time scaling of damped inertial dynamics. SIAM J. Optim., 29, 2227-2256 (2019)

work page 2019

[5] [5]

Attouch, H., Chbani, Z., Peypouquet, J., Redont, P.: Fast convergence of inertial dynamics and algorithms with asymptotic vanishing viscosity. Math. Program. 168, 123–175 (2018)

work page 2018

[6] [6]

28, 849–874 (2018)

Attouch, H., Cabot, A.: Convergenceratesofinertialforward-backwardalgorithms.SIAMJ.Optim. 28, 849–874 (2018)

work page 2018

[7] [7]

Attouch, H., Chbani, Z., Fadili, J., Riahi, H.: Fast convergence of dynamical ADMM via time scaling of damped inertial dynamics. J. Optim. Theory Appl. 193, 704–736 (2022)

work page 2022

[8] [8]

Bai, J., Chen, Y., Dai, Y.H., Liu, Y.J.: A faster proximal-indefinite augmented Lagrangian method withO(1/k 2)convergence rate. J. Comput. Math. (to appear, 2026)

work page 2026

[9] [9]

Bai, J., Jia, L., Peng, Z.: A new insight on augmented Lagrangian method with applications in machine learning. J. Sci. Comput. 99, 53 (2024)

work page 2024

[10] [10]

SIAM, Philadelphia (2017)

Beck, A.:First-Order Methods in Optimization. SIAM, Philadelphia (2017)

work page 2017

[11] [11]

Athena Scientific, Belmont (2016)

Bertsekas, D.P.:Nonlinear Programming, 3rd edn. Athena Scientific, Belmont (2016)

work page 2016

[12] [12]

Boţ, R.I., Nguyen, D.K.: Improved convergence rates and trajectory convergence for primal-dual dynamical systems with vanishing damping. J. Differential Equations 303, 369–406 (2021)

work page 2021

[13] [13]

Boţ, R.I., Csetnek, E.R., Nguyen, D.K.: Fast augmented Lagrangian method in the convex regime with convergence guarantees for the iterates. Math. Program. 200, 147–197 (2023)

work page 2023

[14] [14]

The iterates of Nesterov's accelerated algorithm converge in the critical regimes , year =

Boţ, R.I., Fadili, J., Nguyen, D.K.: The iterates of Nesterov’s accelerated algorithm converge in the critical regimes. arXiv preprint arXiv:2510.22715 (2025)

work page arXiv 2025

[15] [15]

Boyd, S., Parikh, N., Chu, E., Peleato, B., Eckstein, J.: Distributed optimization and statistical learning via the alternating direction method of multipliers. Found. Trends Mach. Learn. 2, 1–122 (2011)

work page 2011

[16] [16]

Chambolle, A., Dossal, C.: On the convergence of the iterates of the fast iterative shrinkage/thresh- olding algorithm. J. Optim. Theory Appl. 166, 968–982 (2016)

work page 2016

[17] [17]

arXiv preprint arXiv:2109.11537 (2021)

Ghadiri, M., Peng, R., Vempala, S.S.: Fasterp-norm regression using sparsity. arXiv preprint arXiv:2109.11537 (2021)

work page arXiv 2021

[18] [18]

He, B., Yuan, X.: On the acceleration of augmented Lagrangian method for linearly constrained optimization. Optim. Online (2010) 28

work page 2010

[19] [19]

He, X., Hu, R., Fang, Y.P.: Convergence rates of inertial primal-dual dynamical methods for separable convex optimization problems. SIAM J. Control Optim. 59, 3278–3301 (2021)

work page 2021

[20] [20]

He, X., Hu, R., Fang, Y.P.: Inertial accelerated primal-dual methods for linear equality constrained convex optimization problems. Numer. Algorithms, 90(4), 1669-1690 (2022)

work page 2022

[21] [21]

Automatica 146, 110547 (2022)

He, X., Hu, R., Fang, Y.P.: Fast primal-dual algorithm via dynamical system for a linearly con- strained convex optimization problem. Automatica 146, 110547 (2022)

work page 2022

[22] [22]

Trajectory convergence and $o(t^{-2})$ rates for Nesterov accelerated primal-dual dynamics without Lipschitz gradient assumption

He, X., Huang, N.J., Y.B. Xiao, Fang, Y.P.: Trajectory convergence ando(t −2)rates for Nes- terov accelerated primal-dual dynamics without Lipschitz gradient assumption. arXiv preprint arXiv:2605.18236 (2026)

work page internal anchor Pith review Pith/arXiv arXiv 2026

[23] [23]

He, X.: Accelerated primal-dual methods with adaptive parameters for composite convex optimiza- tion with linear constraints. Appl. Numer. Math. 203, 129–143 (2024)

work page 2024

[24] [24]

He, X., Huang, N.J., Fang, Y.P.: Non-ergodic convergence rate of an inertial accelerated primal-dual algorithm for saddle point problems. Commun. Nonlinear Sci. Numer. Simul. 140, 108289 (2025)

work page 2025

[25] [25]

He, X., Huang, N.J., Fang, Y.P.: Accelerated linearized alternating direction method of multipliers with Nesterov extrapolation. Commun. Nonlinear Sci. Numer. Simul., 109818 (2026)

work page 2026

[26] [26]

Com- put

He, X., Fang, Y.P.: Accelerated forward-backward algorithms with subgradient corrections. Com- put. Optim. Appl. 93, 121–156 (2026)

work page 2026

[27] [27]

Huang, B., Ma, S., Goldfarb, D.: Accelerated linearized Bregman method. J. Sci. Comput. 54, 428–453 (2013)

work page 2013

[28] [28]

Point convergence of nesterov’s accelerated gradient method: An ai-assisted proof.arXiv preprint arXiv:2510.23513, 2025

Jang, U., Ryu, E.K.: Point convergence of Nesterov’s accelerated gradient method: An AI-assisted proof. arXiv preprint arXiv:2510.23513 (2025)

work page arXiv 2025

[29] [29]

Kang, M., Yun, S., Woo, H., Kang, M.: Accelerated Bregman method for linearly constrainedℓ1-ℓ2 minimization. J. Sci. Comput. 56, 515–534 (2013)

work page 2013

[30] [30]

Springer, Singapore (2020)

Lin, Z., Li, H., Fang, C.:Accelerated Optimization for Machine Learning. Springer, Singapore (2020)

work page 2020

[31] [31]

Luo, H.: A universal accelerated primal-dual method for convex optimization problems. J. Optim. Theory Appl. 201(1), 280–312 (2024)

work page 2024

[32] [32]

Luo, H., Chen, L.: From differential equation solvers to accelerated first-order methods for convex optimization. Math. Program. 195, 735–781 (2022)

work page 2022

[33] [33]

Luo, H., Zhang, Z.: A unified differential equation solver approach for separable convex optimiza- tion: splitting, acceleration and nonergodic rate. Math. Comp. 94, 3009–3041 (2025)

work page 2025

[34] [34]

Soviet Math

Nesterov, Y: A method for solving the convex programming problem with convergence rateO(1/k2). Soviet Math. Dokl. 27, 372–376 (1983)

work page 1983

[35] [35]

Springer, New York (2004)

Nesterov, Y.:Introductory Lectures on Convex Optimization. Springer, New York (2004)

work page 2004

[36] [36]

32, 204–227 (2022)

Sabach, S., Teboulle, M.: FasterLagrangian-basedmethodsinconvexoptimization.SIAMJ.Optim. 32, 204–227 (2022)

work page 2022

[37] [37]

Shen, J., Mousavi, S.: Least sparsity ofp-norm based optimization problems withp >1. SIAM J. Optim. 28, 2721–2751 (2018) 29

work page 2018

[38] [38]

Su, W., Boyd, S., Candès, E.J.: A differential equation for modeling Nesterov’s accelerated gradient method: theory and insights. J. Mach. Learn. Res. 17, 1–43 (2016)

work page 2016

[39] [39]

Tang, T., Toh, K.C.: Self-adaptive ADMM for semi-strongly convex problems. Math. Program. Comput. 16, 113–150 (2024)

work page 2024

[40] [40]

Tao, M., Yuan, X.: Accelerated Uzawa methods for convex optimization. Math. Comp. 86, 1821– 1845 (2017)

work page 2017

[41] [41]

Tran-Dinh, Q., Zhu, Y.: Non-stationary first-order primal-dual algorithms with fast convergence rates. SIAM J. Optim. 30, 2866–2896 (2020)

work page 2020

[42] [42]

Wibisono, A., Wilson, A.C., Jordan, M.I.: A variational perspective on accelerated methods in optimization. Proc. Natl. Acad. Sci. USA 113, E7351–E7358 (2016)

work page 2016

[43] [43]

In: Proceedings of the International Conference on Machine Learning, pp

Woodruff, D., Yasuda, T.: Sharper bounds forℓ p sensitivity sampling. In: Proceedings of the International Conference on Machine Learning, pp. 37238–37272. PMLR (2023)

work page 2023

[44] [44]

Xie, Z., Yin, W., Wen, Z.: ODE-Based Learning to Optimize. Math. Program. (2025) DOI:10.1007/s10107-025-02303-3

work page doi:10.1007/s10107-025-02303-3 2025

[45] [45]

Xu, Y.: Accelerated first-order primal-dual proximal methods for linearly constrained composite convex programming. SIAM J. Optim. 27, 1459–1484 (2017)

work page 2017

[46] [46]

Zeng, X., Yi, P., Hong, Y., Xie, L.: Distributed continuous-time algorithms for nonsmooth extended monotropic optimization problems. SIAM J. Control Optim. 56, 3973–3993 (2018)

work page 2018

[47] [47]

IEEE Trans

Zeng, X., Lei, J., Chen, J.: Dynamical primal-dual Nesterov accelerated method and its application to network optimization. IEEE Trans. Automat. Control 68, 1760–1775 (2023)

work page 2023

[48] [48]

Zhao, Y., Liao, X., He, X., Zhou, M., Li, C.: Acceleratedprimal-dualmirrordynamicsforcentralized and distributed constrained convex optimization problems. J. Mach. Learn. Res. 24, 1–59 (2023) 30

work page 2023