Towards faster first order methods: A continuous-time model to interpolate between speed and function value restart

Huiyuan Guo; Juan Jos\'e Maul\'en; Juan Peypouquet

arxiv: 2506.11267 · v3 · submitted 2025-06-12 · 🧮 math.OC

Towards faster first order methods: A continuous-time model to interpolate between speed and function value restart

Juan Jos\'e Maul\'en , Huiyuan Guo , Juan Peypouquet This is my paper

Pith reviewed 2026-05-19 09:10 UTC · model grok-4.3

classification 🧮 math.OC

keywords restarting schemesinertial dynamicsHessian-driven dampinglinear convergenceaccelerated optimizationfirst-order methodscontinuous-time models

0 comments

The pith

A new restarting scheme for inertial dynamics with Hessian damping achieves linear convergence in function values without knowing the strong convexity parameter.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces a restarting scheme for continuous inertial dynamics that include Hessian-driven damping. This scheme blends speed-based and function-value-based triggers so that restarts occur later than in prior methods. It still guarantees linear convergence of the function values along the restarted paths. The approach requires no knowledge of the strong convexity modulus and generalizes existing speed-restart techniques while keeping function-value decrease intact.

Core claim

The central claim is that a new restarting scheme for the continuous inertial dynamics with Hessian driven-damping establishes a linear convergence rate for the function values along the restarted trajectories. The routine is implemented without knowing the strong convexity parameter and generalizes existing speed restart schemes by interpolating between speed and function value restarts, delaying the restarting time while preserving convergence and function value decrease.

What carries the argument

The interpolating restarting rule applied to continuous inertial dynamics with Hessian-driven damping, which combines speed and function-value criteria to set restart times.

If this is right

Linear convergence rates hold for function values along the restarted trajectories.
Restarting times are delayed relative to pure speed or pure function-value schemes.
Convergence and function-value decrease are preserved under the new rule.
Discretized versions of the dynamics yield accelerated first-order algorithms with improved practical convergence rates.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The interpolation parameter could be tuned to balance restart frequency against convergence speed for specific problem classes.
The continuous-time model offers a template for designing discrete algorithms that adapt restart behavior automatically.
Similar restarting ideas might extend to other damping terms or non-smooth convex problems.

Load-bearing premise

The continuous inertial dynamics with Hessian-driven damping admit a restarting rule that interpolates speed and function-value triggers while preserving linear convergence without explicit knowledge of the strong-convexity modulus.

What would settle it

Numerical runs on a strongly convex quadratic where function values fail to decay linearly after restarts triggered at the interpolated times would disprove the linear convergence claim.

Figures

Figures reproduced from arXiv: 2506.11267 by Huiyuan Guo, Juan Jos\'e Maul\'en, Juan Peypouquet.

**Figure 2.** Figure 2: displays the values of ϕz,λ(t) and φ(xz(t)), where xz(t) is solution of (DIN-AVD), where α = 3, β ∈ {0, 1}, and the function φ is that of Example 1 with ρ = 1. One can clearly appreciate the delays the activation of the λ-extended speed restart, as λ increases. 1 [PITH_FULL_IMAGE:figures/full_fig_p006_2.png] view at source ↗

**Figure 3.** Figure 3: Values along the trajectory, for the solutions of ( [PITH_FULL_IMAGE:figures/full_fig_p012_3.png] view at source ↗

**Figure 4.** Figure 4: (b) presents the evolution of φ(xk) − φ ∗ along the iterations and the coefficients of the approximation in [PITH_FULL_IMAGE:figures/full_fig_p013_4.png] view at source ↗

**Figure 5.** Figure 5: Values of φ(xk)−φ ∗ along the iterations for non restart, speed restart scheme (λ = 0), extended speed restart and the versions with the warm start. 8 Conclusion We propose a new speed restart scheme for an inertial dynamical system with Hessian-driven damping (DIN-AVD). The function values along the trajectories are proven to converge linearly when the objective function is µ-strongly convex. The extended… view at source ↗

read the original abstract

We introduce a new restarting scheme for a continuous inertial dynamics with Hessian driven-damping, and establish a linear convergence rate for the function values along the restarted trajectories. The proposed routine is implemented without knowing the strong convexity parameter, and is a generalization of existing speed restart schemes. It interpolates between speed and function value restarts, considerably delaying the restarting time, while preserving convergence and function value decrease. Numerical experiments show an improvement in the convergence rates for both continuous-time dynamical systems, and the associated accelerated first-order algorithms derived via time discretization.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper gives a restart rule for Hessian-damped inertial flows that blends speed and function-value triggers without needing the strong-convexity constant, but the supporting analysis is only sketched at high level.

read the letter

The central new piece is a restarting scheme for the continuous inertial dynamics with Hessian-driven damping that interpolates between the usual speed restart and a function-value restart. It claims linear convergence of the function values along the restarted trajectories and does so without the user supplying the strong-convexity modulus. That is a clean generalization of existing restart ideas, and the numerical plots on both the ODE and the discretized algorithm show some practical speed-up. The construction itself looks internally consistent with standard Lyapunov arguments for restarted inertial systems, and the parameter-free aspect removes a common practical obstacle. Credit is due for making the interpolation explicit and for checking that the function-value decrease is preserved. The main soft spot is that the abstract and the high-level description give no proof outline, no explicit rate constants, and no clear statement of the precise assumptions on the damping or the objective. Without those details it is hard to tell how much the interpolation actually buys versus how much is carried by the underlying continuous dynamics. The discretization step to first-order algorithms is mentioned but not dissected, so any gap between the continuous guarantee and the discrete implementation remains unexamined. This work is aimed at researchers who already follow continuous-time models for accelerated gradient methods and who care about restart heuristics. A reader looking for a new practical restart knob would find the construction useful even if the rates need tightening. The paper is coherent enough on its own terms to merit a serious referee; the idea is specific and the claims are falsifiable once the proofs are written out. I would send it to review and ask the authors to supply the missing analysis and a clearer comparison table against the two restart baselines they interpolate.

Referee Report

2 major / 2 minor

Summary. The manuscript introduces a restarting scheme for continuous inertial dynamics with Hessian-driven damping. It establishes a linear convergence rate for function values along the restarted trajectories, implements the scheme without knowledge of the strong-convexity parameter, and generalizes existing speed-restart schemes. The new rule interpolates between speed and function-value triggers, delaying restart times while preserving both convergence and function-value decrease. Numerical experiments indicate improved convergence rates for the continuous dynamics and their time-discretized first-order algorithms.

Significance. If the linear-convergence claim and parameter-free property hold under the stated assumptions, the work would supply a useful continuous-time interpolation between restart heuristics, potentially informing more adaptive accelerated methods. The generalization of speed restarts and the reported numerical gains on both continuous and discrete systems constitute concrete strengths.

major comments (2)

[§4, Theorem 4.1] §4, Theorem 4.1: the linear convergence statement for the restarted trajectories is asserted without an explicit Lyapunov function or error-bound derivation in the main text; the appendix sketch does not show how the interpolating threshold preserves the rate independently of the unknown strong-convexity modulus μ.
[§3.2, Eq. (18)] §3.2, Eq. (18): the restart condition that interpolates speed and function-value triggers is defined using a parameter α whose admissible range is not shown to be independent of μ; this leaves open whether the claimed parameter-free implementation is fully realized.

minor comments (2)

[Abstract] The abstract claims the scheme 'considerably delays' restart time, yet no quantitative bound or comparison metric is supplied in §5 or the experiments.
[Figure 3] Figure 3 caption does not specify the discretization step size or the exact first-order scheme used, hindering reproducibility of the discrete experiments.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful reading and constructive feedback on our manuscript. We address each major comment below and describe the revisions planned for the next version.

read point-by-point responses

Referee: [§4, Theorem 4.1] §4, Theorem 4.1: the linear convergence statement for the restarted trajectories is asserted without an explicit Lyapunov function or error-bound derivation in the main text; the appendix sketch does not show how the interpolating threshold preserves the rate independently of the unknown strong-convexity modulus μ.

Authors: We agree that the main text would benefit from a clearer outline of the Lyapunov analysis. In the revised version we will add a short paragraph in Section 4 that states the Lyapunov function explicitly and sketches the key error-bound steps leading to the linear rate. The appendix will be expanded with a self-contained derivation that isolates the contribution of the interpolating threshold and shows, step by step, that the contraction factor remains independent of μ. These additions will make the parameter-free character of the result fully transparent. revision: yes
Referee: [§3.2, Eq. (18)] §3.2, Eq. (18): the restart condition that interpolates speed and function-value triggers is defined using a parameter α whose admissible range is not shown to be independent of μ; this leaves open whether the claimed parameter-free implementation is fully realized.

Authors: The admissible interval for α is chosen so that the interpolation remains valid under the same assumptions used for the speed-restart case and does not require knowledge of μ. To eliminate any ambiguity we will insert a short remark immediately after Equation (18) that states the explicit bounds on α and verifies that these bounds are independent of the strong-convexity modulus. This clarification will confirm that the implementation stays parameter-free. revision: yes

Circularity Check

0 steps flagged

No significant circularity detected in derivation chain

full rationale

The paper introduces a restarting scheme for continuous inertial dynamics with Hessian-driven damping and derives linear convergence of function values along the restarted trajectories as an established result, rather than defining it tautologically via the restart rule itself. The scheme is presented as generalizing existing speed restarts while interpolating triggers and avoiding explicit knowledge of the strong-convexity modulus. No load-bearing steps reduce by construction to inputs, self-definitions, fitted predictions renamed as results, or chains of self-citations that substitute for independent verification. The central claims remain self-contained against the stated dynamics and standard Lyapunov-style analyses for restarted inertial systems.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

Based on the abstract alone, the central claim rests on the existence of a well-defined interpolation rule for the restart time that preserves linear convergence for the given continuous dynamics. No explicit free parameters or invented entities are mentioned.

axioms (1)

domain assumption The continuous inertial dynamics with Hessian-driven damping admit a restart rule that interpolates speed and function-value triggers while preserving linear convergence without knowledge of the strong-convexity parameter.
This premise is required for the claimed linear rate and parameter-free implementation.

pith-pipeline@v0.9.0 · 5621 in / 1288 out tokens · 44981 ms · 2026-05-19T09:10:28.186637+00:00 · methodology

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

We introduce a new restarting scheme for a continuous inertial dynamics with Hessian driven-damping... T^λ(z) = inf {t > 0 : ϕ_{z,λ}(t) ≤ 0}
IndisputableMonolith/Foundation/DimensionForcing.lean reality_from_one_distinction unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

Theorem 1... linear convergence rate for the function values along the restarted trajectories

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

32 extracted references · 32 canonical work pages

[1]

Alamo, P

T. Alamo, P. Krupa, and D. Limon. Gradient based restart F ISTA. In 2019 IEEE 58th Conference on Decision and Control (CDC) , pages 3936–3941. IEEE, 2019

work page 2019
[2]

Alamo, P

T. Alamo, P. Krupa, and D. Limon. Restart of accelerated ﬁ rst-order methods with lin- ear convergence under a quadratic functional growth condit ion. IEEE Transactions on Automatic Control, 68(1):612–619, 2022

work page 2022
[3]

Alamo, D

T. Alamo, D. Limon, and P. Krupa. Restart FISTA with globa l linear convergence. In 2019 18th European Control Conference (ECC) , pages 1969–1974. IEEE, 2019

work page 2019
[4]

´Alvarez, H

F. ´Alvarez, H. Attouch, J. Bolte, and P. Redont. A second-order gradient-like dissipative dynamical system with Hessian-driven damping: Applicatio n to optimization and mechan- ics. Journal de math´ ematiques pures et appliqu´ ees, 81(8):747–779, 2002

work page 2002
[5]

Applegate, O

D. Applegate, O. Hinder, H. Lu, and M. Lubin. Faster ﬁrst- order primal-dual methods for linear programming using restarts and sharpness. Mathematical Programming, 201(1):133– 184, 2023

work page 2023
[6]

Attouch, Z

H. Attouch, Z. Chbani, J. Fadili, and H. Riahi. First-ord er optimization algorithms via inertial systems with Hessian driven damping. Mathematical Programming, pages 1–43, 2020

work page 2020
[7]

Attouch, Z

H. Attouch, Z. Chbani, J. Peypouquet, and P. Redont. Fast convergence of inertial dy- namics and algorithms with asymptotic vanishing viscosity . Mathematical Programming, 168:123–175, 2018

work page 2018
[8]

Attouch and J

H. Attouch and J. Peypouquet. The rate of convergence of n esterov’s accelerated forward- backward method is actually faster than 1 /k2. SIAM Journal on Optimization , 26(3):1824– 1834, 2016

work page 2016
[9]

Attouch, J

H. Attouch, J. Peypouquet, and P. Redont. Fast convex opt imization via inertial dynamics with Hessian driven damping. Journal of Diﬀerential Equations , 261(10):5734–5783, 2016

work page 2016
[10]

Aujol, C

J.-F. Aujol, C. Dossal, H. Labarri` ere, and A. Rondepie rre. Fista restart using an automatic estimation of the growth parameter. Journal of Optimization Theory and Applications , 206(2):1–27, 2025

work page 2025
[11]

C. Bao, L. Chen, J. Li, and Z. Shen. Accelerated gradient methods with gradient restart: Global linear convergence. arXiv preprint arXiv:2401.07672 , 2024. 15

work page arXiv 2024
[12]

Beck and M

A. Beck and M. Teboulle. A fast iterative shrinkage-thr esholding algorithm for linear inverse problems. SIAM journal on imaging sciences , 2(1):183–202, 2009

work page 2009
[13]

Fercoq and Z

O. Fercoq and Z. Qu. Adaptive restart of accelerated gra dient methods under local quadratic growth condition. IMA Journal of Numerical Analysis , 39(4):2069–2095, 2019

work page 2069
[14]

Giselsson and S

P. Giselsson and S. Boyd. Monotonicity and restart in fa st gradient methods. In 53rd IEEE Conference on Decision and Control , pages 5058–5063. IEEE, 2014

work page 2014
[15]

H. Guo, J. J. Maul´ en, and J. Peypouquet. A speed restart scheme for a dynami- cal system with Hessian-driven damping and three constant c oeﬃcients. arXiv preprint arXiv:2412.06691, 2024

work page arXiv 2024
[16]

Li and B

B.-W. Li and B. Shi. Linear convergence of ista and FISTA . Journal of the Operations Research Society of China , 2024

work page 2024
[17]

Li and Z

H. Li and Z. Lin. Restarted nonconvex accelerated gradi ent descent: No more polylogarith- mic factor in the O(ε− 7/4) complexity. Journal of Machine Learning Research , 24(157):1– 37, 2023

work page 2023
[18]

Lin and L

Q. Lin and L. Xiao. An adaptive accelerated proximal gra dient method and its homotopy continuation for sparse optimization. In International Conference on Machine Learning , pages 73–81. PMLR, 2014

work page 2014
[19]

H. Luo, L. Tang, and X. Yang. An accelerated gradient met hod with adaptive restart for convex multiobjective optimization problems. arXiv preprint arXiv:2501.07863 , 2025

work page arXiv 2025
[20]

J. J. Maul´ en and J. Peypouquet. A speed restart scheme f or a dynamics with Hessian driven damping. Journal of Optimization Theory and Applications , 2023

work page 2023
[21]

R. May. Asymptotic for a second-order evolution equati on with convex potential andvan- ishing damping term. Turkish Journal of Mathematics , 41(3):681–685, 2017

work page 2017
[22]

Necoara, Y

I. Necoara, Y. Nesterov, and F. Glineur. Linear converg ence of ﬁrst order methods for non-strongly convex optimization. Mathematical Programming, 175(1-2):69–107, 2019

work page 2019
[23]

Nesterov

Y. Nesterov. A method for solving the convex programmin g problem with convergence rate O(1/k2). Proceedings of the USSR Academy of Sciences , 269:543–547, 1983

work page 1983
[24]

Nesterov

Y. Nesterov. Introductory Lectures on Convex Optimization: A Basic Course . Applied Optimization. Springer US, 2003

work page 2003
[25]

Nesterov

Y. Nesterov. Gradient methods for minimizing composit e functions. Mathematical pro- gramming, 140(1):125–161, 2013

work page 2013
[26]

O’Donoghue and E

B. O’Donoghue and E. Cand` es. Adaptive restart for acce lerated gradient schemes. Foun- dations of computational mathematics , 15:715–732, 2015

work page 2015
[27]

J. Park. Accelerated additive Schwarz methods for conv ex optimization with adaptive restart. Journal of Scientiﬁc Computing , 89(3):58, 2021

work page 2021
[28]

Roulet and A

V. Roulet and A. d’Aspremont. Sharpness, restart and ac celeration. Advances in Neural Information Processing Systems , 30, 2017

work page 2017
[29]

B. Shi, S. S. Du, M. I. Jordan, and W. J. Su. Understanding the acceleration phenomenon via high-resolution diﬀerential equations. Mathematical Programming, pages 1–70, 2022. 16

work page 2022
[30]

W. Su, S. Boyd, and E. J. Cand` es. A diﬀerential equation f or modeling Nesterov’s ac- celerated gradient method: Theory and insights. Journal of Machine Learning Research , 17(153):1–43, 2016

work page 2016
[31]

B. Wang, T. Nguyen, T. Sun, A. Bertozzi, R. Baraniuk, and S. Osher. Scheduled restart momentum for accelerated stochastic gradient descent. SIAM Journal on Imaging Sciences , 15(2):738–761, 2022

work page 2022
[32]

Wang and J

Z. Wang and J. Peypouquet. Accelerated gradient method s via inertial systems with Hessian-driven damping. arXiv preprint arXiv:2502.16953 , 2025. 17

work page arXiv 2025

[1] [1]

Alamo, P

T. Alamo, P. Krupa, and D. Limon. Gradient based restart F ISTA. In 2019 IEEE 58th Conference on Decision and Control (CDC) , pages 3936–3941. IEEE, 2019

work page 2019

[2] [2]

Alamo, P

T. Alamo, P. Krupa, and D. Limon. Restart of accelerated ﬁ rst-order methods with lin- ear convergence under a quadratic functional growth condit ion. IEEE Transactions on Automatic Control, 68(1):612–619, 2022

work page 2022

[3] [3]

Alamo, D

T. Alamo, D. Limon, and P. Krupa. Restart FISTA with globa l linear convergence. In 2019 18th European Control Conference (ECC) , pages 1969–1974. IEEE, 2019

work page 2019

[4] [4]

´Alvarez, H

F. ´Alvarez, H. Attouch, J. Bolte, and P. Redont. A second-order gradient-like dissipative dynamical system with Hessian-driven damping: Applicatio n to optimization and mechan- ics. Journal de math´ ematiques pures et appliqu´ ees, 81(8):747–779, 2002

work page 2002

[5] [5]

Applegate, O

D. Applegate, O. Hinder, H. Lu, and M. Lubin. Faster ﬁrst- order primal-dual methods for linear programming using restarts and sharpness. Mathematical Programming, 201(1):133– 184, 2023

work page 2023

[6] [6]

Attouch, Z

H. Attouch, Z. Chbani, J. Fadili, and H. Riahi. First-ord er optimization algorithms via inertial systems with Hessian driven damping. Mathematical Programming, pages 1–43, 2020

work page 2020

[7] [7]

Attouch, Z

H. Attouch, Z. Chbani, J. Peypouquet, and P. Redont. Fast convergence of inertial dy- namics and algorithms with asymptotic vanishing viscosity . Mathematical Programming, 168:123–175, 2018

work page 2018

[8] [8]

Attouch and J

H. Attouch and J. Peypouquet. The rate of convergence of n esterov’s accelerated forward- backward method is actually faster than 1 /k2. SIAM Journal on Optimization , 26(3):1824– 1834, 2016

work page 2016

[9] [9]

Attouch, J

H. Attouch, J. Peypouquet, and P. Redont. Fast convex opt imization via inertial dynamics with Hessian driven damping. Journal of Diﬀerential Equations , 261(10):5734–5783, 2016

work page 2016

[10] [10]

Aujol, C

J.-F. Aujol, C. Dossal, H. Labarri` ere, and A. Rondepie rre. Fista restart using an automatic estimation of the growth parameter. Journal of Optimization Theory and Applications , 206(2):1–27, 2025

work page 2025

[11] [11]

C. Bao, L. Chen, J. Li, and Z. Shen. Accelerated gradient methods with gradient restart: Global linear convergence. arXiv preprint arXiv:2401.07672 , 2024. 15

work page arXiv 2024

[12] [12]

Beck and M

A. Beck and M. Teboulle. A fast iterative shrinkage-thr esholding algorithm for linear inverse problems. SIAM journal on imaging sciences , 2(1):183–202, 2009

work page 2009

[13] [13]

Fercoq and Z

O. Fercoq and Z. Qu. Adaptive restart of accelerated gra dient methods under local quadratic growth condition. IMA Journal of Numerical Analysis , 39(4):2069–2095, 2019

work page 2069

[14] [14]

Giselsson and S

P. Giselsson and S. Boyd. Monotonicity and restart in fa st gradient methods. In 53rd IEEE Conference on Decision and Control , pages 5058–5063. IEEE, 2014

work page 2014

[15] [15]

H. Guo, J. J. Maul´ en, and J. Peypouquet. A speed restart scheme for a dynami- cal system with Hessian-driven damping and three constant c oeﬃcients. arXiv preprint arXiv:2412.06691, 2024

work page arXiv 2024

[16] [16]

Li and B

B.-W. Li and B. Shi. Linear convergence of ista and FISTA . Journal of the Operations Research Society of China , 2024

work page 2024

[17] [17]

Li and Z

H. Li and Z. Lin. Restarted nonconvex accelerated gradi ent descent: No more polylogarith- mic factor in the O(ε− 7/4) complexity. Journal of Machine Learning Research , 24(157):1– 37, 2023

work page 2023

[18] [18]

Lin and L

Q. Lin and L. Xiao. An adaptive accelerated proximal gra dient method and its homotopy continuation for sparse optimization. In International Conference on Machine Learning , pages 73–81. PMLR, 2014

work page 2014

[19] [19]

H. Luo, L. Tang, and X. Yang. An accelerated gradient met hod with adaptive restart for convex multiobjective optimization problems. arXiv preprint arXiv:2501.07863 , 2025

work page arXiv 2025

[20] [20]

J. J. Maul´ en and J. Peypouquet. A speed restart scheme f or a dynamics with Hessian driven damping. Journal of Optimization Theory and Applications , 2023

work page 2023

[21] [21]

R. May. Asymptotic for a second-order evolution equati on with convex potential andvan- ishing damping term. Turkish Journal of Mathematics , 41(3):681–685, 2017

work page 2017

[22] [22]

Necoara, Y

I. Necoara, Y. Nesterov, and F. Glineur. Linear converg ence of ﬁrst order methods for non-strongly convex optimization. Mathematical Programming, 175(1-2):69–107, 2019

work page 2019

[23] [23]

Nesterov

Y. Nesterov. A method for solving the convex programmin g problem with convergence rate O(1/k2). Proceedings of the USSR Academy of Sciences , 269:543–547, 1983

work page 1983

[24] [24]

Nesterov

Y. Nesterov. Introductory Lectures on Convex Optimization: A Basic Course . Applied Optimization. Springer US, 2003

work page 2003

[25] [25]

Nesterov

Y. Nesterov. Gradient methods for minimizing composit e functions. Mathematical pro- gramming, 140(1):125–161, 2013

work page 2013

[26] [26]

O’Donoghue and E

B. O’Donoghue and E. Cand` es. Adaptive restart for acce lerated gradient schemes. Foun- dations of computational mathematics , 15:715–732, 2015

work page 2015

[27] [27]

J. Park. Accelerated additive Schwarz methods for conv ex optimization with adaptive restart. Journal of Scientiﬁc Computing , 89(3):58, 2021

work page 2021

[28] [28]

Roulet and A

V. Roulet and A. d’Aspremont. Sharpness, restart and ac celeration. Advances in Neural Information Processing Systems , 30, 2017

work page 2017

[29] [29]

B. Shi, S. S. Du, M. I. Jordan, and W. J. Su. Understanding the acceleration phenomenon via high-resolution diﬀerential equations. Mathematical Programming, pages 1–70, 2022. 16

work page 2022

[30] [30]

W. Su, S. Boyd, and E. J. Cand` es. A diﬀerential equation f or modeling Nesterov’s ac- celerated gradient method: Theory and insights. Journal of Machine Learning Research , 17(153):1–43, 2016

work page 2016

[31] [31]

B. Wang, T. Nguyen, T. Sun, A. Bertozzi, R. Baraniuk, and S. Osher. Scheduled restart momentum for accelerated stochastic gradient descent. SIAM Journal on Imaging Sciences , 15(2):738–761, 2022

work page 2022

[32] [32]

Wang and J

Z. Wang and J. Peypouquet. Accelerated gradient method s via inertial systems with Hessian-driven damping. arXiv preprint arXiv:2502.16953 , 2025. 17

work page arXiv 2025