Trajectory convergence and o(t⁻²) rates for Nesterov accelerated primal-dual dynamics without Lipschitz gradient assumption
Pith reviewed 2026-05-20 09:16 UTC · model grok-4.3
The pith
Nesterov accelerated primal-dual dynamics converge to a primal-dual solution for α ≥ 3 without Lipschitz gradient assumption
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central result is that, in finite-dimensional spaces, the solution trajectories of the given second-order primal-dual system converge to a point satisfying the first-order optimality conditions for every α ≥ 3, relying only on convexity and continuous differentiability of f. The proof uses a Lyapunov function involving Bregman distances, which takes over the role that Lipschitz gradient estimates play in earlier analyses. For α > 3 the same Lyapunov function implies that both the objective gap and the constraint violation are o(1/t²) as t → ∞.
What carries the argument
Bregman distance arguments in finite-dimensional Euclidean space that substitute for the missing Lipschitz gradient estimates
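For reference, the Bregman distance of a convex, differentiable f is usually defined as follows. This is the standard textbook definition; it is assumed, not confirmed from the paper itself, to be what the symbol D_f denotes there.

\[ D_f(x,y) = f(x) - f(y) - \langle \nabla f(y),\, x-y \rangle \;\ge\; 0 \qquad \text{for all } x, y \in \mathbb{R}^n. \]

Nonnegativity follows from convexity alone, which is presumably why this quantity can appear in a Lyapunov function without any bound on how fast ∇f varies.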
If this is right
- The trajectory converges even at the critical value α = 3.
- o(t^{-2}) rates hold for both objective residual and feasibility violation when α > 3.
- The strategy extends to time-scaled primal-dual dynamics.
- The result applies to convex problems with continuously differentiable but non-Lipschitz gradients.
Where Pith is reading between the lines
- Numerical simulations on non-Lipschitz gradient functions could verify the rates in practice.
- The finite-dimensional requirement indicates that additional tools would be needed for infinite-dimensional extensions.
- Similar acceleration techniques might now be analyzed for a broader set of convex programs without the Lipschitz restriction.
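The simulation suggested in the first point above can be sketched as follows. Everything specific here is an assumption made for illustration and not taken from the paper: the test problem (minimize (x1⁴ + x2⁴)/4 subject to x1 + x2 = 1, whose gradient is not globally Lipschitz), the parameters α = 4, θ = 1/2, β = 1, the start time t0 = 1, the initial data, and the hand-rolled fixed-step RK4 integrator. Only the two differential equations are copied from the abstract.

```python
import math

# Illustrative test problem (an assumption, not from the paper):
#   minimize f(x) = (x1^4 + x2^4) / 4   subject to   x1 + x2 = 1.
# The gradient (x1^3, x2^3) is continuous but not globally Lipschitz.
A = (1.0, 1.0)                       # constraint row, so Ax = x1 + x2
B = 1.0                              # right-hand side b
ALPHA, THETA, BETA = 4.0, 0.5, 1.0   # illustrative parameter choices
X_STAR = (0.5, 0.5)                  # primal solution (by symmetry)
LAM_STAR = -0.125                    # multiplier: grad f(x*) + A^T lam* = 0
F_STAR = 0.03125                     # optimal value f(x*)


def f(x):
    return (x[0] ** 4 + x[1] ** 4) / 4.0


def grad_f(x):
    return (x[0] ** 3, x[1] ** 3)


def rhs(t, s):
    """First-order form of the system; s = (x1, x2, lam, x1', x2', lam')."""
    x1, x2, lam, v1, v2, mu = s
    residual = A[0] * x1 + A[1] * x2 - B
    g = grad_f((x1, x2))
    # lam + theta*t*lam' + beta*(Ax - b), the scalar multiplying A^T
    dual = lam + THETA * t * mu + BETA * residual
    a1 = -(ALPHA / t) * v1 - g[0] - A[0] * dual
    a2 = -(ALPHA / t) * v2 - g[1] - A[1] * dual
    # A(x + theta*t*x') - b, which drives the multiplier equation
    extrap = A[0] * (x1 + THETA * t * v1) + A[1] * (x2 + THETA * t * v2) - B
    amu = -(ALPHA / t) * mu + extrap
    return (v1, v2, mu, a1, a2, amu)


def rk4_step(t, s, h):
    """One classical fourth-order Runge-Kutta step of size h."""
    def shift(state, k, c):
        return tuple(si + c * ki for si, ki in zip(state, k))

    k1 = rhs(t, s)
    k2 = rhs(t + h / 2, shift(s, k1, h / 2))
    k3 = rhs(t + h / 2, shift(s, k2, h / 2))
    k4 = rhs(t + h, shift(s, k3, h))
    return tuple(
        si + h / 6 * (a + 2 * b + 2 * c + d)
        for si, a, b, c, d in zip(s, k1, k2, k3, k4)
    )


def simulate(t0=1.0, t_end=40.0, h=0.01, x0=(1.0, -0.5), lam0=0.0):
    """Integrate from t0 to t_end, starting at rest.

    Returns (x, lam, feasibility violation, objective gap) at t_end.
    """
    s = (x0[0], x0[1], lam0, 0.0, 0.0, 0.0)
    steps = int(round((t_end - t0) / h))
    for k in range(steps):
        s = rk4_step(t0 + k * h, s, h)
    x = (s[0], s[1])
    feasibility = abs(A[0] * x[0] + A[1] * x[1] - B)
    gap = abs(f(x) - F_STAR)
    return x, s[2], feasibility, gap


if __name__ == "__main__":
    x_end, lam_end, feasibility, gap = simulate()
    print("x(T) =", x_end, "lambda(T) =", lam_end)
    print("|Ax - b| =", feasibility, "|f(x) - f*| =", gap)
```

To probe the claimed rates rather than only the limit, one could record t²·|Ax(t) − b| and t²·|f(x(t)) − f*| along the run and check whether they tend to zero. A fixed-step explicit integrator is a deliberate simplification; the θ·t coupling grows with t, so longer horizons would call for a smaller step or an adaptive solver.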
Load-bearing premise
The underlying vector space must be finite-dimensional to enable the Bregman-distance technique that bypasses the Lipschitz gradient requirement.
What would settle it
A convex, continuously differentiable function with non-Lipschitz gradient, posed in an infinite-dimensional Hilbert space, for which the primal-dual trajectory fails to converge would confirm that the restriction to finite dimensions is necessary. Conversely, a convergence proof in that setting would show that the restriction is only an artifact of the technique.
Original abstract
We consider the Nesterov accelerated primal-dual dynamical system \[ \begin{cases} \ddot{x}(t)+\dfrac{\alpha}{t}\dot{x}(t) +\nabla f(x(t)) +A^\top\bigl(\lambda(t)+\theta t\dot{\lambda}(t)\bigr)+\beta A^\top(Ax(t)-b)=0,\\[0.6em] \ddot{\lambda}(t)+\dfrac{\alpha}{t}\dot{\lambda}(t) -\bigl(A(x(t)+\theta t\dot{x}(t))-b\bigr)=0, \end{cases} \] which is linked to the linearly constrained optimization problem $ \min_{x\in\mathbb{R}^n} f(x),\ \text{s.t.}\ Ax=b, $ where $\alpha\ge 3$ and $f$ is convex and continuously differentiable. In a Hilbert framework, the weak convergence of its trajectory was established by Bo\c{t} and Nguyen (J. Differential Equations, 303:369--406, 2021) under $\alpha>3$ and the Lipschitz continuity assumption on $\nabla f$. In this paper, we prove in finite-dimensional spaces that the trajectory converges to a primal-dual solution for $\alpha\ge3$, without assuming Lipschitz continuity of $\nabla f$. Moreover, when $\alpha>3$, we establish improved $o(t^{-2})$ convergence rates for both the objective residual and the feasibility violation. Our analysis relies on Bregman-distance arguments instead of the Lipschitz continuity of $\nabla f$. The same strategy can also be extended to time-scaled primal-dual dynamics to obtain analogous convergence results. To the best of our knowledge, these are the first results on this topic without a Lipschitz gradient assumption. Our results also represent the first work on the convergence of the trajectory of the accelerated primal-dual dynamical system in the critical case $\alpha=3$.
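The objects named in the abstract can be written out as below. This is the standard augmented Lagrangian and the standard optimality system for a linearly constrained convex problem; the symbols $\mathcal{L}_\beta$, $x^*$ and $\lambda^*$ are assumed to match the paper's notation.

\[ \mathcal{L}_\beta(x,\lambda) = f(x) + \langle \lambda,\, Ax-b \rangle + \frac{\beta}{2}\|Ax-b\|^2, \]
\[ \nabla f(x^*) + A^\top \lambda^* = 0, \qquad Ax^* = b. \]

With this notation the two equations of the system read $\ddot{x}+\frac{\alpha}{t}\dot{x}+\nabla_x \mathcal{L}_\beta\bigl(x,\ \lambda+\theta t\dot{\lambda}\bigr)=0$ and $\ddot{\lambda}+\frac{\alpha}{t}\dot{\lambda}-\nabla_\lambda \mathcal{L}_\beta\bigl(x+\theta t\dot{x},\ \lambda\bigr)=0$, which can be checked term by term against the displayed system: damped inertial descent in $x$ and ascent in $\lambda$, each evaluated at an extrapolated value of the other variable.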
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. This manuscript analyzes the convergence of trajectories for a Nesterov-accelerated primal-dual dynamical system associated with linearly constrained convex optimization problems. In finite-dimensional spaces, it establishes convergence to a primal-dual solution for damping parameter α ≥ 3 without assuming Lipschitz continuity of ∇f, relying on Bregman-distance arguments. For α > 3, o(t^{-2}) rates are derived for the objective residual and feasibility gap. The analysis is also extended to time-scaled variants.
Significance. If the results hold, the work is significant because it removes the standard Lipschitz-gradient assumption that limits many continuous-time optimization analyses, thereby broadening applicability to convex problems with non-Lipschitz gradients. The treatment of the critical case α = 3 and the successful use of Bregman distances in finite dimensions constitute a clear technical advance over the cited Hilbert-space results of Boţ and Nguyen.
major comments (1)
- [Well-posedness / existence analysis (likely early in the proofs section)] The global existence of solutions on [0, ∞) is load-bearing for the central convergence claim. Local existence follows from Peano’s theorem since the right-hand side is merely continuous, but extension to the whole half-line requires an a priori bound preventing finite-time blow-up of ||x(t)|| + ||λ(t)||. It is unclear whether this bound is obtained independently of the Bregman-distance Lyapunov function constructed later for convergence and rates; any circular dependence would invalidate the statement that “the trajectory converges.” Please identify the precise section or lemma that establishes the bound without reference to the maximal existence interval.
minor comments (2)
- [Statement of main theorems] Clarify whether the o(t^{-2}) notation denotes little-o or big-O and whether the hidden constant depends on initial data or problem parameters.
- [Introduction] The abstract and introduction cite Boţ and Nguyen (2021) but do not explicitly contrast the new finite-dimensional Bregman argument with the Lipschitz-based Lyapunov function used in that reference; a short comparative paragraph would help readers.
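The distinction raised in the first minor comment can be stated precisely. These are the usual definitions, given here for a generic function $g$ standing in for either the objective residual or the feasibility violation:

\[ g(t) = o(t^{-2}) \iff \lim_{t\to+\infty} t^2\, g(t) = 0, \qquad g(t) = O(t^{-2}) \iff \limsup_{t\to+\infty} t^2\, |g(t)| < +\infty. \]

Under the little-o reading there is no hidden constant in the limit statement itself, although how quickly $t^2 g(t)$ tends to zero may still depend on the initial data.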
Simulated Author's Rebuttal
We thank the referee for the careful reading of our manuscript and for the constructive comment on the well-posedness analysis. We address the point below.
Point-by-point responses
- Referee: The global existence of solutions on [0, ∞) is load-bearing for the central convergence claim. Local existence follows from Peano’s theorem since the right-hand side is merely continuous, but extension to the whole half-line requires an a priori bound preventing finite-time blow-up of ||x(t)|| + ||λ(t)||. It is unclear whether this bound is obtained independently of the Bregman-distance Lyapunov function constructed later for convergence and rates; any circular dependence would invalidate the statement that “the trajectory converges.” Please identify the precise section or lemma that establishes the bound without reference to the maximal existence interval.
  Authors: The global existence is treated in Lemma 3.1, which appears before the convergence statements. In this lemma we construct the Bregman-distance Lyapunov function V(t) directly from the system equations and the convexity of f. We derive the differential inequality V̇(t) ≤ 0 (or a suitable negative term when α > 3) that holds on any interval where a solution exists. Because V(t) is nonnegative and controls the Euclidean norms of (x(t), λ(t)) through the properties of the Bregman distance in finite dimensions, the trajectory remains bounded on the maximal interval [0, T_max). Standard ODE continuation theory then implies T_max = ∞. The construction of V and the inequality V̇ ≤ 0 rely only on the vector field and convexity; they do not invoke the limit of the trajectory or any convergence result proved later. We will add a short clarifying remark immediately after Lemma 3.1 to make the logical order and the absence of circularity explicit.
  revision: partial
Circularity Check
No significant circularity; derivation relies on independent mathematical tools
full rationale
The paper's central claims rest on standard local existence via Peano's theorem in finite dimensions followed by Bregman-distance Lyapunov analysis to obtain global convergence and rates for α ≥ 3 without Lipschitz assumptions. No step reduces by construction to a fitted parameter, self-citation chain, or renamed input; the finite-dimensional restriction is explicitly justified as enabling the Bregman arguments, and the derivation chain remains self-contained against external ODE and convex-analysis benchmarks.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption: f is convex and continuously differentiable
- domain assumption: The dynamical system admits trajectories in finite-dimensional space
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean, washburn_uniqueness_aczel [unclear]
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Our analysis relies on Bregman-distance arguments... $\int_{t_0}^{+\infty} t\, D_f(x^*, x(t))\, dt < +\infty$ ... trajectory converges to a primal-dual solution for $\alpha \ge 3$
- IndisputableMonolith/Foundation/RealityFromDistinction.lean, reality_from_one_distinction [unclear]
  Relation between the paper passage and the cited Recognition theorem: unclear.
  energy function $\mathcal{E}_{z^*}(t) = \theta^2 t^2 \bigl(\mathcal{L}_\beta(x(t),\lambda^*) - \mathcal{L}_\beta(x^*,\lambda(t))\bigr) + \tfrac{1}{2}\|v(t)\|^2 + \tfrac{\xi}{2}\|z(t)-z^*\|^2$
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Forward citations
Cited by 1 Pith paper
- Convergence of iterates and improved rates for accelerated augmented Lagrangian methods for linearly constrained convex optimization
  Accelerated augmented Lagrangian methods for convex problems achieve convergence and o(1/k^2) rates for feasibility violation and objective residual under suitable parameters.