A Restart-Free Accelerated Algorithm for Non-Convex Minimization: Continuous and Discrete Analysis

Kansei Ushiyama; Shun Sato

arxiv: 2606.30050 · v1 · pith:65Q55CFTnew · submitted 2026-06-29 · 🧮 math.OC

A Restart-Free Accelerated Algorithm for Non-Convex Minimization: Continuous and Discrete Analysis

Kansei Ushiyama , Shun Sato This is my paper

Pith reviewed 2026-06-30 05:22 UTC · model grok-4.3

classification 🧮 math.OC

keywords nonconvex optimizationfirst-order methodsrestart-free algorithmsaccelerated convergencecontinuous-time analysisperformance estimation problemadaptive Lipschitz estimationordinary differential equations

0 comments

The pith

Two new first-order methods reach an ε-approximate stationary point for nonconvex problems in O(ε^{-7/4}) evaluations without restarts or prior knowledge of ε.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces two algorithms for minimizing nonconvex functions whose gradients and Hessians are Lipschitz continuous. Both reach an approximate first-order stationary point using O(ε^{-7/4}) function and gradient evaluations, and neither requires ε as an input parameter or any restart mechanism. The methods arise from discretizing a new continuous-time ordinary differential equation model, with convergence proved in parallel for the continuous and discrete versions inside the Performance Estimation Problem framework. The first algorithm keeps its last iterate differentiable with respect to the starting point; the second estimates the Lipschitz constants on the fly.

Core claim

By discretizing a newly proposed continuous-time ODE model, the authors obtain two restart-free first-order methods that attain the O(ε^{-7/4}) complexity for finding an ε-approximate first-order stationary point of a nonconvex function with Lipschitz gradient and Hessian, without using ε as an input parameter and without any restart mechanism. The continuous- and discrete-time analyses proceed in parallel under the Performance Estimation Problem framework. One version yields a simple implementation whose final iterate remains differentiable with respect to the initial point; the other version estimates the Lipschitz constants adaptively and therefore requires no a priori knowledge of those

What carries the argument

A newly introduced continuous-time ordinary differential equation model whose discretizations produce the algorithms and whose convergence properties transfer to the discrete setting under the Performance Estimation Problem framework.

If this is right

The first algorithm admits a simple implementation in which the last iterate is differentiable with respect to the initial point.
The second algorithm estimates the required Lipschitz constants during execution and therefore needs no a priori knowledge of them.
Both algorithms avoid restart mechanisms while matching the best known complexity for this problem class.
The continuous and discrete analyses are carried out in parallel inside the same Performance Estimation Problem framework.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Because the final iterate of the first algorithm remains differentiable, the method could be embedded inside larger differentiable pipelines such as hyperparameter optimization or end-to-end learning.
The restart-free design may reduce sensitivity to the choice of restart frequency on problems where the landscape changes across scales.
The parallel continuous-discrete analysis suggests that similar ODE models could be discretized for other nonconvex problem classes while preserving the PEP-based rate.
Adaptive estimation of Lipschitz constants without restarts may improve practical robustness on functions where the local smoothness varies.

Load-bearing premise

The convergence guarantees proved for the continuous ODE model carry over to its discretizations inside the Performance Estimation Problem framework without extra error terms that would worsen the rate.

What would settle it

A concrete nonconvex function with known Lipschitz constants on which the proposed discrete algorithm requires more than O(ε^{-7/4}) evaluations to reach an ε-stationary point, or a counter-example showing that the PEP analysis introduces a discretization error that degrades the rate below O(ε^{-7/4}).

read the original abstract

We propose two novel first-order methods for minimizing nonconvex functions with Lipschitz-continuous gradients and Hessians. These algorithms attain an $\varepsilon$-approximate first-order stationary point in $\mathrm{O}(\varepsilon^{-7/4})$ function and gradient evaluations, without using $\varepsilon$ as an input parameter. While existing methods rely on restart mechanisms to achieve this complexity, our methods do not. Consequently, the first algorithm enjoys a simple implementation, making its last iterate differentiable with respect to the initial point. By estimating the Lipschitz constants adaptively, we develop the second algorithm that does not require prior knowledge of the constants. This algorithm exhibits better numerical performance than existing parameter-free methods for certain problems, which can be attributed to its restart-free design. Both algorithms are derived by discretizing a newly introduced continuous-time model represented by an ordinary differential equation, and their continuous- and discrete-time convergence analyses proceed in a parallel manner under the Performance Estimation Problem framework.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

This paper gives restart-free discretizations of a new ODE that reach ε-stationary points in O(ε^{-7/4}) evaluations for nonconvex smooth problems without feeding ε or the Lipschitz constants to the algorithm.

read the letter

The main contribution is a pair of first-order methods that avoid restarts while matching the O(ε^{-7/4}) complexity that restart-based schemes already achieve. They come from discretizing a newly proposed continuous-time ODE, and the authors run the convergence analysis for both the ODE and the discrete iterates in parallel under the PEP framework. The restart-free property keeps the final iterate differentiable with respect to the starting point, which is useful when the optimizer sits inside a larger differentiable pipeline. The second variant estimates the Lipschitz constants adaptively and shows better numerical behavior than other parameter-free baselines on the problems they test.

The parallel continuous-discrete PEP analysis is the cleanest part of the work; it makes the rate transfer explicit rather than relying on separate arguments. The adaptive version also removes the need to know the constants in advance, which is a practical improvement over some earlier parameter-free methods.

The soft spot is the transfer step itself. The abstract claims the discretization preserves the rate without extra error terms, but the actual ODE equation, the PEP formulation, and the discretization map are not shown here. If those pieces are tight, the claim holds; if the PEP is constructed around the target rate or if discretization introduces hidden factors, the exponent could degrade. The numerical results are reported but without enough detail on problem selection or baseline tuning to judge how general the advantage is.

This is for people working on accelerated first-order methods or PEP analysis in nonconvex settings. A reader who needs restart-free or fully adaptive schemes would find the construction useful. It deserves a serious referee because the restart-free angle is verifiable once the PEP details are checked and the numerical edge is reproducible.

Referee Report

2 major / 2 minor

Summary. The paper proposes two novel restart-free first-order methods for non-convex minimization of functions with Lipschitz gradients and Hessians. Derived by discretizing a new continuous-time ODE, the methods achieve an O(ε^{-7/4}) complexity bound for ε-approximate first-order stationary points without restarts or knowledge of ε in advance. One method is simple with a differentiable last iterate; the second adaptively estimates the Lipschitz constants. Convergence analyses for both the continuous ODE and discrete algorithms are carried out in parallel under the Performance Estimation Problem (PEP) framework.

Significance. If the central claims hold, the work is significant for providing restart-free accelerated rates in non-convex optimization, simplifying implementation compared to restart-based methods and enabling better practical performance in the adaptive variant. The parallel continuous-discrete PEP analysis is a methodological contribution, and the differentiability property supports applications such as bilevel optimization. The adaptive algorithm's lack of prior parameter knowledge addresses a common practical limitation.

major comments (2)

[§5] §5 (discretization and PEP transfer): the analysis transfers the O(ε^{-7/4}) rate from the continuous ODE to the discrete algorithm via PEP but does not explicitly bound or cancel discretization error terms; without such bounds the rate preservation for the discrete iterates is not guaranteed and requires additional justification.
[§6.2] §6.2 (adaptive Lipschitz estimation): the second algorithm's adaptive estimation procedure is claimed to preserve the O(ε^{-7/4}) bound on function/gradient evaluations, yet the number of extra evaluations required for the estimation is not bounded in the PEP formulation, which could inflate the total complexity.

minor comments (2)

[Notation] Notation for the new ODE (Eq. (3) or equivalent) uses several auxiliary functions whose definitions are spread across sections; a consolidated table of symbols would improve readability.
[Numerical experiments] Figure 2 (numerical comparison) lacks error bars or multiple random seeds, making it difficult to assess whether the reported advantage over restart-based baselines is statistically consistent.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their careful reading and constructive comments on our manuscript. We address the two major comments below, proposing revisions to strengthen the presentation and analysis.

read point-by-point responses

Referee: [§5] §5 (discretization and PEP transfer): the analysis transfers the O(ε^{-7/4}) rate from the continuous ODE to the discrete algorithm via PEP but does not explicitly bound or cancel discretization error terms; without such bounds the rate preservation for the discrete iterates is not guaranteed and requires additional justification.

Authors: We appreciate the referee pointing out the need for clearer justification. The PEP analysis is formulated directly on the discrete algorithm's update rules (in parallel with the continuous ODE), so that the rate is obtained without relying on a transfer from the continuous-time bound. Nevertheless, to make this explicit and address the concern about discretization error terms, we will revise §5 to include a dedicated paragraph explaining why no additional error bounds are required in the PEP setup and how the discrete dynamics are captured exactly. revision: yes
Referee: [§6.2] §6.2 (adaptive Lipschitz estimation): the second algorithm's adaptive estimation procedure is claimed to preserve the O(ε^{-7/4}) bound on function/gradient evaluations, yet the number of extra evaluations required for the estimation is not bounded in the PEP formulation, which could inflate the total complexity.

Authors: We thank the referee for this observation. The adaptive estimation steps are part of the algorithm, but we agree that their contribution to the total number of evaluations must be bounded explicitly within the PEP framework. We will revise §6.2 to augment the PEP formulation with the estimation procedure and derive a bound showing that the extra evaluations are absorbed into the overall O(ε^{-7/4}) complexity without inflation. revision: yes

Circularity Check

0 steps flagged

No significant circularity identified

full rationale

The paper introduces a new continuous-time ODE and derives discrete algorithms by discretization, with both continuous and discrete analyses conducted in parallel under the PEP framework. No equations or steps are quoted that reduce the claimed O(ε^{-7/4}) rate to a fitted parameter, self-defined quantity, or self-citation chain by construction. The abstract presents the ODE as newly introduced rather than reverse-engineered from the target rate, and the PEP application is described as transferring properties without additional degrading terms. This leaves the central claim with independent analytical content, consistent with a self-contained derivation against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 1 invented entities

Review performed on abstract only; no explicit free parameters, axioms beyond the stated Lipschitz assumptions, or invented entities beyond the new ODE are visible.

axioms (1)

domain assumption Objective functions have Lipschitz-continuous gradients and Hessians
Required for the stated complexity and stated in the abstract.

invented entities (1)

New continuous-time ODE model no independent evidence
purpose: To serve as the source from which both discrete algorithms are derived
Described as newly introduced in the abstract

pith-pipeline@v0.9.1-grok · 5691 in / 1286 out tokens · 46667 ms · 2026-06-30T05:22:09.073520+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

65 extracted references · 32 canonical work pages · 2 internal anchors

[1]

Springer Optimization and Its Applications, vol

Nesterov, Y .: Lectures on Convex Optimization. Springer Optimization and Its Applications, vol. 137, p. 589. Springer, Cham (2018). https://doi.org/10.1007/ 978-3-319-91578-4

2018
[2]

In: Proceedings of the 39th Inter- national Conference on Machine Learning, vol

Li, H., Lin, Z.: Restarted nonconvex accelerated gradient descent: No more poly- logarithmic factor in the o(ε−7/4) complexity. In: Proceedings of the 39th Inter- national Conference on Machine Learning, vol. 162, pp. 12901–12916 (2022). https://proceedings.mlr.press/v162/li22o.html

2022
[3]

Marumo, N., Takeda, A.: Parameter-free accelerated gradient descent for noncon- vex minimization. SIAM J. Optim. 34(2), 2093–2120 (2024) https://doi.org/10.1137/ 22M1540934

2093
[4]

Marumo, N., Takeda, A.: Universal heavy-ball method for nonconvex optimization under H ¨older continuous Hessians. Math. Program. 212(1-2), 147–175 (2025) https: //doi.org/10.1007/s10107-024-02100-4 46

work page doi:10.1007/s10107-024-02100-4 2025
[5]

Sharp First-Order Lower Bounds for Higher-Order Smooth Nonconvex Optimization

Zhou, D.: Sharp First-Order Lower Bounds for Higher-Order Smooth Nonconvex Optimization. https://arxiv.org/abs/2606.05438

work page internal anchor Pith review Pith/arXiv arXiv
[6]

Cartis, C., Gould, N.I.M., Toint, P..L.: On the complexity of steepest descent, Newton’s and regularized Newton’s methods for nonconvex unconstrained optimization problems. SIAM J. Optim. 20(6), 2833–2852 (2010) https://doi.org/10.1137/090774100

work page doi:10.1137/090774100 2010
[7]

Carmon, Y ., Duchi, J.C., Hinder, O., Sidford, A.: Lower bounds for finding sta- tionary points I. Math. Program. 184(1-2), 71–120 (2020) https://doi.org/10.1007/ s10107-019-01406-y

2020
[8]

Ochs, P., Chen, Y ., Brox, T., Pock, T.: iPiano: inertial proximal algorithm for nonconvex optimization. SIAM J. Imaging Sci. 7(2), 1388–1419 (2014) https://doi.org/10.1137/ 130942954

2014
[9]

O’Neill, M., Wright, S.J.: Behavior of accelerated gradient methods near critical points of nonconvex functions. Math. Program. 176(1-2), 403–427 (2019) https://doi.org/10. 1007/s10107-018-1340-y

2019
[10]

In: Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing, pp

Agarwal, N., Allen-Zhu, Z., Bullins, B., Hazan, E., Ma, T.: Finding approximate local minima faster than gradient descent. In: Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing, pp. 1195–1199 (2017). https://doi.org/10.1145/ 3055399.3055464

work page arXiv 2017
[11]

Carmon, Y ., Duchi, J.C., Hinder, O., Sidford, A.: Accelerated methods for noncon- vex optimization. SIAM J. Optim. 28(2), 1751–1772 (2018) https://doi.org/10.1137/ 17M1114296

2018
[12]

Royer, C.W., O’Neill, M., Wright, S.J.: A Newton-CG algorithm with complexity guarantees for smooth unconstrained optimization. Math. Program. 180(1-2), 451–488 (2020) https://doi.org/10.1007/s10107-019-01362-7

work page doi:10.1007/s10107-019-01362-7 2020
[13]

Royer, C.W., Wright, S.J.: Complexity analysis of second-order line-search algorithms for smooth nonconvex optimization. SIAM J. Optim. 28(2), 1448–1477 (2018) https: //doi.org/10.1137/17M1134329

work page doi:10.1137/17m1134329 2018
[14]

NEON+: Accelerated Gradient Methods for Extracting Negative Curvature for Non-Convex Optimization

Xu, Y ., Jin, R., Yang, T.: Neon+: Accelerated gradient methods for extracting negative curvature for non-convex optimization. arXiv preprint arXiv:1712.01033 (2017)

work page internal anchor Pith review Pith/arXiv arXiv 2017
[15]

In: Advances in Neural Information Processing Systems, vol

Allen-Zhu, Z., Li, Y .: Neon2: Finding local minima via first-order oracles. In: Advances in Neural Information Processing Systems, vol. 31 (2018)

2018
[16]

In: Proceedings of the 31st Conference On Learning The- ory, vol

Jin, C., Netrapalli, P., Jordan, M.I.: Accelerated gradient descent escapes saddle points faster than gradient descent. In: Proceedings of the 31st Conference On Learning The- ory, vol. 75, pp. 1042–1085 (2018). https://proceedings.mlr.press/v75/jin18a.html

2018
[17]

Convex until proven guilty

Carmon, Y ., Duchi, J.C., Hinder, O., Sidford, A.: “Convex until proven guilty”: 47 Dimension-free acceleration of gradient descent on non-convex functions. In: Proceed- ings of the 34th International Conference on Machine Learning, vol. 70, pp. 654–663 (2017). https://proceedings.mlr.press/v70/carmon17a.html

2017
[18]

In: Pro- ceedings of the 57th Annual ACM Symposium on Theory of Computing, pp

Jiang, R., Mokhtari, A., Patitucci, F.: Improved complexity for smooth nonconvex optimization: A two-level online learning approach with quasi-newton methods. In: Pro- ceedings of the 57th Annual ACM Symposium on Theory of Computing, pp. 2225–2236 (2025). https://doi.org/10.1145/3717823.3718308

work page doi:10.1145/3717823.3718308 2025
[19]

Carmon, Y ., Duchi, J.C., Hinder, O., Sidford, A.: Lower bounds for finding stationary points II: first-order methods. Math. Program.185(1-2), 315–355 (2021) https://doi.org/ 10.1007/s10107-019-01431-x

work page doi:10.1007/s10107-019-01431-x 2021
[20]

In: Advances in Neural Information Processing Systems, vol

Andrychowicz, M., Denil, M., G ´omez, S., Hoffman, M.W., Pfau, D., Schaul, T., Shillingford, B., Freitas, N.: Learning to learn by gradient descent by gradient descent. In: Advances in Neural Information Processing Systems, vol. 29 (2016)

2016
[21]

In: Proceedings of the 34th International Conference on Machine Learn- ing, vol

Finn, C., Abbeel, P., Levine, S.: Model-agnostic meta-learning for fast adaptation of deep networks. In: Proceedings of the 34th International Conference on Machine Learn- ing, vol. 70, pp. 1126–1135 (2017).https://proceedings.mlr.press/v70/finn17a.html

2017
[22]

In: International Conference on Learning Representations (2017)

Metz, L., Poole, B., Pfau, D., Sohl-Dickstein, J.: Unrolled generative adversar- ial networks. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=BydrOIcle

2017
[23]

In: Proceedings of the 27th International Conference on International Conference on Machine Learning, pp

Gregor, K., LeCun, Y .: Learning fast approximations of sparse coding. In: Proceedings of the 27th International Conference on International Conference on Machine Learning, pp. 399–406 (2010)

2010
[24]

In: Advances in Neural Information Processing Systems, vol

Yang, Y ., Sun, J., Li, H., Xu, Z.: Deep ADMM-Net for compressive sensing mri. In: Advances in Neural Information Processing Systems, vol. 29 (2016)

2016
[25]

USSR Comput

Polyak, B.T.: Some methods of speeding up the convergence of iterative methods. USSR Comput. Math. Math. Phys. 4(5), 1–17 (1964) https://doi.org/10.1016/0041-5553(64) 90137-5

work page doi:10.1016/0041-5553(64 1964
[26]

Su, W., Boyd, S., Cand `es, E.J.: A differential equation for modeling Nesterov’s accel- erated gradient method: theory and insights. J. Mach. Learn. Res. 17(153), 1–43 (2016)

2016
[27]

Nesterov, Y .E.: A method for solving the convex programming problem with conver- gence rate O(1/k2). Dokl. Akad. Nauk SSSR 269(3), 543–547 (1983)

1983
[28]

In: Advances in Neural Information Processing Systems, vol

Krichene, W., Bayen, A., Bartlett, P.L.: Accelerated mirror descent in continuous and discrete time. In: Advances in Neural Information Processing Systems, vol. 28 (2015)

2015
[29]

Shi, B., Du, S.S., Jordan, M.I., Su, W.J.: Understanding the acceleration phenomenon 48 via high-resolution differential equations. Math. Program. 195(1-2), 79–148 (2022) https://doi.org/10.1007/s10107-021-01681-8

work page doi:10.1007/s10107-021-01681-8 2022
[30]

Wilson, A.C., Recht, B., Jordan, M.I.: A Lyapunov analysis of accelerated methods in optimization. J. Mach. Learn. Res. 22(113), 1–34 (2021)

2021
[31]

arXiv preprint arXiv:2406.06100 (2024)

Okamura, K., Marumo, N., Takeda, A.: Primitive heavy-ball dynamics achieves o(ε−7/4) convergence for nonconvex optimization. arXiv preprint arXiv:2406.06100 (2024)

work page arXiv 2024
[32]

In: Advances in Neural Information Processing Systems, vol

Ushiyama, K., Sato, S., Matsuo, T.: A unified discretization framework for differential equation approach with Lyapunov arguments for convex optimiza- tion. In: Advances in Neural Information Processing Systems, vol. 37 (2023). https://openreview.net/forum?id=8YN62t19AW

2023
[33]

Foundations and Trends in Optimization 5(1-2), 1–245 (2021) https://doi.org/10.1561/2400000036

d’Aspremont, A., Scieur, D., Taylor, A.: Acceleration methods. Foundations and Trends in Optimization 5(1-2), 1–245 (2021) https://doi.org/10.1561/2400000036

work page doi:10.1561/2400000036 2021
[34]

In: International Conference on Machine Learning (2018)

Taylor, A., Van Scoy, B., Lessard, L.: Lyapunov functions for first-order methods: Tight automated convergence guarantees. In: International Conference on Machine Learning (2018). https://proceedings.mlr.press/v80/taylor18a.html

2018
[35]

In: Proceedings of the Thirty- Second Conference on Learning Theory, vol

Taylor, A., Bach, F.: Stochastic first-order methods: non-asymptotic and computer-aided analyses via potential functions. In: Proceedings of the Thirty- Second Conference on Learning Theory, vol. 99, pp. 2934–2992 (2019). https://proceedings.mlr.press/v99/taylor19a.html

2019
[36]

Upadhyaya, M., Banert, S., Taylor, A.B., Giselsson, P.: Automated tight Lyapunov anal- ysis for first-order methods. Math. Program. 209(1-2), 133–170 (2025) https://doi.org/ 10.1007/s10107-024-02061-8

work page doi:10.1007/s10107-024-02061-8 2025
[37]

In: International Conference on Machine Learning, pp

Suh, J.J., Roh, G., Ryu, E.K.: Continuous-time analysis of accelerated gra- dient methods via conservation laws in dilated coordinate systems. In: International Conference on Machine Learning, pp. 20640–20667 (2022). https://proceedings.mlr.press/v162/suh22a.html

2022
[38]

Moucer, C., Taylor, A., Bach, F.: A systematic approach to Lyapunov analyses of continuous-time models in convex optimization. SIAM J. Optim. 33(3), 1558–1586 (2023) https://doi.org/10.1137/22M1498486

work page doi:10.1137/22m1498486 2023
[39]

JSIAM Letters 16, 29–32 (2024) https://doi.org/10.14495/jsiaml.16.29

Kamijima, T., Sato, S., Ushiyama, K., Matsuo, T., Tanaka, K.: Analysis of continuous dynamical system models with hessians derived from optimization methods. JSIAM Letters 16, 29–32 (2024) https://doi.org/10.14495/jsiaml.16.29

work page doi:10.14495/jsiaml.16.29 2024
[40]

Drori, Y ., Teboulle, M.: Performance of first-order methods for smooth convex mini- mization: a novel approach. Math. Program. 145(1-2, Ser. A), 451–482 (2014) https: //doi.org/10.1007/s10107-013-0653-0 49

work page doi:10.1007/s10107-013-0653-0 2014
[41]

Taylor, A.B., Hendrickx, J.M., Glineur, F.: Smooth strongly convex interpolation and exact worst-case performance of first-order methods. Math. Program. 161(1-2), 307– 345 (2017) https://doi.org/10.1007/s10107-016-1009-3

work page doi:10.1007/s10107-016-1009-3 2017
[42]

In: 2017 IEEE 56th Annual Conference on Decision and Control (CDC), pp

Taylor, A.B., Hendrickx, J.M., Glineur, F.: Performance estimation toolbox (PESTO): Automated worst-case analysis of first-order optimization methods. In: 2017 IEEE 56th Annual Conference on Decision and Control (CDC), pp. 1278–1283 (2017). https://doi. org/10.1109/CDC.2017.8263832

work page doi:10.1109/cdc.2017.8263832 2017
[43]

arXiv preprint arXiv:2201.04040 (2022)

Goujaud, B., Moucer, C., Glineur, F., Hendrickx, J., Taylor, A., Dieuleveut, A.: PEPit: computer-assisted worst-case analyses of first-order optimization methods in Python. arXiv preprint arXiv:2201.04040 (2022)

work page arXiv 2022
[44]

Kim, D., Fessler, J.A.: Optimized first-order methods for smooth convex mini- mization. Math. Program. 159(1-2, Ser. A), 81–107 (2016) https://doi.org/10.1007/ s10107-015-0949-3

2016
[45]

Taylor, A., Drori, Y .: An optimal gradient method for smooth strongly convex minimiza- tion. Math. Program., 1–38 (2022) https://doi.org/10.1007/s10107-022-01839-y

work page doi:10.1007/s10107-022-01839-y 2022
[46]

Jang, U., Gupta, S.D., Ryu, E.K.: Computer-assisted design of accelerated compos- ite optimization methods: OptISTA. Math. Program. in press (2025) https://doi.org/10. 1007/s10107-025-02258-5

2025
[47]

Drori, Y ., Teboulle, M.: An optimal variant of Kelley’s cutting-plane method. Math. Program. 160(1-2), 321–351 (2016) https://doi.org/10.1007/s10107-016-0985-7

work page doi:10.1007/s10107-016-0985-7 2016
[48]

Kim, D.: Accelerated proximal point method for maximally monotone operators. Math. Program. 190(1-2), 57–87 (2021) https://doi.org/10.1007/s10107-021-01643-0

work page doi:10.1007/s10107-021-01643-0 2021
[49]

Lieder, F.: On the convergence rate of the Halpern-iteration. Optim. Lett.15(2), 405–418 (2021) https://doi.org/10.1007/s11590-020-01617-9

work page doi:10.1007/s11590-020-01617-9 2021
[50]

In: Proceedings of the 39th International Conference on Machine Learning, vol

Park, J., Ryu, E.K.: Exact optimal accelerated complexity for fixed-point iterations. In: Proceedings of the 39th International Conference on Machine Learning, vol. 162, pp. 17420–17457 (2022). https://proceedings.mlr.press/v162/park22c.html

2022
[51]

Kim, D., Fessler, J.A.: Optimizing the efficiency of first-order methods for decreas- ing the gradient of smooth convex functions. J. Optim. Theory Appl. 188(1), 192–219 (2021) https://doi.org/10.1007/s10957-020-01770-2

work page doi:10.1007/s10957-020-01770-2 2021
[52]

In: Proceedings of the 38th International Conference on Machine Learning, vol

Yoon, T., Ryu, E.K.: Accelerated algorithms for smooth convex-concave minimax problems with o(1/k2) rate on squared gradient norm. In: Proceedings of the 38th International Conference on Machine Learning, vol. 139, pp. 12098–12109 (2021). https://proceedings.mlr.press/v139/yoon21d.html

2021
[53]

Kim, D., Fessler, J.A.: Another look at the fast iterative shrinkage/thresholding 50 algorithm (FISTA). SIAM J. Optim. 28(1), 223–250 (2018) https://doi.org/10.1137/ 16M108940X

2018
[54]

Klerk, E., Glineur, F.c., Taylor, A.B.: On the worst-case complexity of the gradient method with exact line search for smooth strongly convex functions. Optim. Lett.11(7), 1185–1199 (2017) https://doi.org/10.1007/s11590-016-1087-4

work page doi:10.1007/s11590-016-1087-4 2017
[55]

In: Proceedings of the 37th International Conference on Machine Learning, vol

Drori, Y ., Shamir, O.: The complexity of finding stationary points with stochastic gradi- ent descent. In: Proceedings of the 37th International Conference on Machine Learning, vol. 119, pp. 2658–2667 (2020). https://proceedings.mlr.press/v119/drori20a.html

2020
[56]

Das Gupta, S., Freund, R.M., Sun, X.A., Taylor, A.: Nonlinear conjugate gradient methods: worst-case convergence rates via computer-assisted analyses. Math. Program. 213(1-2), 1–49 (2025) https://doi.org/10.1007/s10107-024-02127-7

work page doi:10.1007/s10107-024-02127-7 2025
[57]

Dragomir, R.-A., Taylor, A.B., d’Aspremont, A., Bolte, J.: Optimal complexity and certification of Bregman first-order methods. Math. Program. 194(1-2), 41–83 (2022) https://doi.org/10.1007/s10107-021-01618-1

work page doi:10.1007/s10107-021-01618-1 2022
[58]

Ryu, E.K., Taylor, A.B., Bergeling, C., Giselsson, P.: Operator splitting performance estimation: tight contraction factors and optimal parameter selection. SIAM J. Optim. 30(3), 2251–2271 (2020) https://doi.org/10.1137/19M1304854

work page doi:10.1137/19m1304854 2020
[59]

Klerk, E., Glineur, F.c., Taylor, A.B.: Worst-case convergence analysis of inexact gra- dient and Newton methods through semidefinite programming performance estimation. SIAM J. Optim. 30(3), 2053–2082 (2020) https://doi.org/10.1137/19M1281368

work page doi:10.1137/19m1281368 2053
[60]

Barr ´e, M., Taylor, A.B., Bach, F.: Principled analyses and design of first-order methods with inexact proximal operators. Math. Program. 201(1-2), 185–230 (2023) https://doi. org/10.1007/s10107-022-01903-7

work page doi:10.1007/s10107-022-01903-7 2023
[61]

Das Gupta, S., Van Parys, B.P.G., Ryu, E.K.: Branch-and-bound performance esti- mation programming: a unified methodology for constructing optimal optimiza- tion methods. Math. Program. 204(1-2), 567–639 (2024) https://doi.org/10.1007/ s10107-023-01973-1

2024
[62]

In: Advances in Neural Information Processing Systems, vol

Kim, J., Yang, I.: Convergence analysis of ODE models for accelerated first-order meth- ods via positive semidefinite kernels. In: Advances in Neural Information Processing Systems, vol. 37 (2023). https://openreview.net/forum?id=XFE6zpevLc

2023
[63]

METR 2024-02 (2024)

Ushiyama, K., Sato, S., Matsuo, T.: Deriving optimal rates of continuous-time acceler- ated first-order methods via performance estimation problems. METR 2024-02 (2024)

2024
[64]

arXiv preprint arXiv:2504.17193 (2025) 51

Bot ¸, R.I., Dao, M.N., Liu, T., Lourenc ¸o, B.F., Marumo, N.: On the equiva- lence of a hessian-free inequality and lipschitz continuous hessian. arXiv preprint arXiv:2504.17193 (2025) 51

work page arXiv 2025
[65]

International Journal of Mathematical Modelling and Numerical Optimisation 4(2), 150–194 (2013) 52

Jamil, M., Yang, X.-S.: A literature survey of benchmark functions for global opti- misation problems. International Journal of Mathematical Modelling and Numerical Optimisation 4(2), 150–194 (2013) 52

2013

[1] [1]

Springer Optimization and Its Applications, vol

Nesterov, Y .: Lectures on Convex Optimization. Springer Optimization and Its Applications, vol. 137, p. 589. Springer, Cham (2018). https://doi.org/10.1007/ 978-3-319-91578-4

2018

[2] [2]

In: Proceedings of the 39th Inter- national Conference on Machine Learning, vol

Li, H., Lin, Z.: Restarted nonconvex accelerated gradient descent: No more poly- logarithmic factor in the o(ε−7/4) complexity. In: Proceedings of the 39th Inter- national Conference on Machine Learning, vol. 162, pp. 12901–12916 (2022). https://proceedings.mlr.press/v162/li22o.html

2022

[3] [3]

Marumo, N., Takeda, A.: Parameter-free accelerated gradient descent for noncon- vex minimization. SIAM J. Optim. 34(2), 2093–2120 (2024) https://doi.org/10.1137/ 22M1540934

2093

[4] [4]

Marumo, N., Takeda, A.: Universal heavy-ball method for nonconvex optimization under H ¨older continuous Hessians. Math. Program. 212(1-2), 147–175 (2025) https: //doi.org/10.1007/s10107-024-02100-4 46

work page doi:10.1007/s10107-024-02100-4 2025

[5] [5]

Sharp First-Order Lower Bounds for Higher-Order Smooth Nonconvex Optimization

Zhou, D.: Sharp First-Order Lower Bounds for Higher-Order Smooth Nonconvex Optimization. https://arxiv.org/abs/2606.05438

work page internal anchor Pith review Pith/arXiv arXiv

[6] [6]

Cartis, C., Gould, N.I.M., Toint, P..L.: On the complexity of steepest descent, Newton’s and regularized Newton’s methods for nonconvex unconstrained optimization problems. SIAM J. Optim. 20(6), 2833–2852 (2010) https://doi.org/10.1137/090774100

work page doi:10.1137/090774100 2010

[7] [7]

Carmon, Y ., Duchi, J.C., Hinder, O., Sidford, A.: Lower bounds for finding sta- tionary points I. Math. Program. 184(1-2), 71–120 (2020) https://doi.org/10.1007/ s10107-019-01406-y

2020

[8] [8]

Ochs, P., Chen, Y ., Brox, T., Pock, T.: iPiano: inertial proximal algorithm for nonconvex optimization. SIAM J. Imaging Sci. 7(2), 1388–1419 (2014) https://doi.org/10.1137/ 130942954

2014

[9] [9]

O’Neill, M., Wright, S.J.: Behavior of accelerated gradient methods near critical points of nonconvex functions. Math. Program. 176(1-2), 403–427 (2019) https://doi.org/10. 1007/s10107-018-1340-y

2019

[10] [10]

In: Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing, pp

Agarwal, N., Allen-Zhu, Z., Bullins, B., Hazan, E., Ma, T.: Finding approximate local minima faster than gradient descent. In: Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing, pp. 1195–1199 (2017). https://doi.org/10.1145/ 3055399.3055464

work page arXiv 2017

[11] [11]

Carmon, Y ., Duchi, J.C., Hinder, O., Sidford, A.: Accelerated methods for noncon- vex optimization. SIAM J. Optim. 28(2), 1751–1772 (2018) https://doi.org/10.1137/ 17M1114296

2018

[12] [12]

Royer, C.W., O’Neill, M., Wright, S.J.: A Newton-CG algorithm with complexity guarantees for smooth unconstrained optimization. Math. Program. 180(1-2), 451–488 (2020) https://doi.org/10.1007/s10107-019-01362-7

work page doi:10.1007/s10107-019-01362-7 2020

[13] [13]

Royer, C.W., Wright, S.J.: Complexity analysis of second-order line-search algorithms for smooth nonconvex optimization. SIAM J. Optim. 28(2), 1448–1477 (2018) https: //doi.org/10.1137/17M1134329

work page doi:10.1137/17m1134329 2018

[14] [14]

NEON+: Accelerated Gradient Methods for Extracting Negative Curvature for Non-Convex Optimization

Xu, Y ., Jin, R., Yang, T.: Neon+: Accelerated gradient methods for extracting negative curvature for non-convex optimization. arXiv preprint arXiv:1712.01033 (2017)

work page internal anchor Pith review Pith/arXiv arXiv 2017

[15] [15]

In: Advances in Neural Information Processing Systems, vol

Allen-Zhu, Z., Li, Y .: Neon2: Finding local minima via first-order oracles. In: Advances in Neural Information Processing Systems, vol. 31 (2018)

2018

[16] [16]

In: Proceedings of the 31st Conference On Learning The- ory, vol

Jin, C., Netrapalli, P., Jordan, M.I.: Accelerated gradient descent escapes saddle points faster than gradient descent. In: Proceedings of the 31st Conference On Learning The- ory, vol. 75, pp. 1042–1085 (2018). https://proceedings.mlr.press/v75/jin18a.html

2018

[17] [17]

Convex until proven guilty

Carmon, Y ., Duchi, J.C., Hinder, O., Sidford, A.: “Convex until proven guilty”: 47 Dimension-free acceleration of gradient descent on non-convex functions. In: Proceed- ings of the 34th International Conference on Machine Learning, vol. 70, pp. 654–663 (2017). https://proceedings.mlr.press/v70/carmon17a.html

2017

[18] [18]

In: Pro- ceedings of the 57th Annual ACM Symposium on Theory of Computing, pp

Jiang, R., Mokhtari, A., Patitucci, F.: Improved complexity for smooth nonconvex optimization: A two-level online learning approach with quasi-newton methods. In: Pro- ceedings of the 57th Annual ACM Symposium on Theory of Computing, pp. 2225–2236 (2025). https://doi.org/10.1145/3717823.3718308

work page doi:10.1145/3717823.3718308 2025

[19] [19]

Carmon, Y ., Duchi, J.C., Hinder, O., Sidford, A.: Lower bounds for finding stationary points II: first-order methods. Math. Program.185(1-2), 315–355 (2021) https://doi.org/ 10.1007/s10107-019-01431-x

work page doi:10.1007/s10107-019-01431-x 2021

[20] [20]

In: Advances in Neural Information Processing Systems, vol

Andrychowicz, M., Denil, M., G ´omez, S., Hoffman, M.W., Pfau, D., Schaul, T., Shillingford, B., Freitas, N.: Learning to learn by gradient descent by gradient descent. In: Advances in Neural Information Processing Systems, vol. 29 (2016)

2016

[21] [21]

In: Proceedings of the 34th International Conference on Machine Learn- ing, vol

Finn, C., Abbeel, P., Levine, S.: Model-agnostic meta-learning for fast adaptation of deep networks. In: Proceedings of the 34th International Conference on Machine Learn- ing, vol. 70, pp. 1126–1135 (2017).https://proceedings.mlr.press/v70/finn17a.html

2017

[22] [22]

In: International Conference on Learning Representations (2017)

Metz, L., Poole, B., Pfau, D., Sohl-Dickstein, J.: Unrolled generative adversar- ial networks. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=BydrOIcle

2017

[23] [23]

In: Proceedings of the 27th International Conference on International Conference on Machine Learning, pp

Gregor, K., LeCun, Y .: Learning fast approximations of sparse coding. In: Proceedings of the 27th International Conference on International Conference on Machine Learning, pp. 399–406 (2010)

2010

[24] [24]

In: Advances in Neural Information Processing Systems, vol

Yang, Y ., Sun, J., Li, H., Xu, Z.: Deep ADMM-Net for compressive sensing mri. In: Advances in Neural Information Processing Systems, vol. 29 (2016)

2016

[25] [25]

USSR Comput

Polyak, B.T.: Some methods of speeding up the convergence of iterative methods. USSR Comput. Math. Math. Phys. 4(5), 1–17 (1964) https://doi.org/10.1016/0041-5553(64) 90137-5

work page doi:10.1016/0041-5553(64 1964

[26] [26]

Su, W., Boyd, S., Cand `es, E.J.: A differential equation for modeling Nesterov’s accel- erated gradient method: theory and insights. J. Mach. Learn. Res. 17(153), 1–43 (2016)

2016

[27] [27]

Nesterov, Y .E.: A method for solving the convex programming problem with conver- gence rate O(1/k2). Dokl. Akad. Nauk SSSR 269(3), 543–547 (1983)

1983

[28] [28]

In: Advances in Neural Information Processing Systems, vol

Krichene, W., Bayen, A., Bartlett, P.L.: Accelerated mirror descent in continuous and discrete time. In: Advances in Neural Information Processing Systems, vol. 28 (2015)

2015

[29] [29]

Shi, B., Du, S.S., Jordan, M.I., Su, W.J.: Understanding the acceleration phenomenon 48 via high-resolution differential equations. Math. Program. 195(1-2), 79–148 (2022) https://doi.org/10.1007/s10107-021-01681-8

work page doi:10.1007/s10107-021-01681-8 2022

[30] [30]

Wilson, A.C., Recht, B., Jordan, M.I.: A Lyapunov analysis of accelerated methods in optimization. J. Mach. Learn. Res. 22(113), 1–34 (2021)

2021

[31] [31]

arXiv preprint arXiv:2406.06100 (2024)

Okamura, K., Marumo, N., Takeda, A.: Primitive heavy-ball dynamics achieves o(ε−7/4) convergence for nonconvex optimization. arXiv preprint arXiv:2406.06100 (2024)

work page arXiv 2024

[32] [32]

In: Advances in Neural Information Processing Systems, vol

Ushiyama, K., Sato, S., Matsuo, T.: A unified discretization framework for differential equation approach with Lyapunov arguments for convex optimiza- tion. In: Advances in Neural Information Processing Systems, vol. 37 (2023). https://openreview.net/forum?id=8YN62t19AW

2023

[33] [33]

Foundations and Trends in Optimization 5(1-2), 1–245 (2021) https://doi.org/10.1561/2400000036

d’Aspremont, A., Scieur, D., Taylor, A.: Acceleration methods. Foundations and Trends in Optimization 5(1-2), 1–245 (2021) https://doi.org/10.1561/2400000036

work page doi:10.1561/2400000036 2021

[34] [34]

In: International Conference on Machine Learning (2018)

Taylor, A., Van Scoy, B., Lessard, L.: Lyapunov functions for first-order methods: Tight automated convergence guarantees. In: International Conference on Machine Learning (2018). https://proceedings.mlr.press/v80/taylor18a.html

2018

[35] [35]

In: Proceedings of the Thirty- Second Conference on Learning Theory, vol

Taylor, A., Bach, F.: Stochastic first-order methods: non-asymptotic and computer-aided analyses via potential functions. In: Proceedings of the Thirty- Second Conference on Learning Theory, vol. 99, pp. 2934–2992 (2019). https://proceedings.mlr.press/v99/taylor19a.html

2019

[36] [36]

Upadhyaya, M., Banert, S., Taylor, A.B., Giselsson, P.: Automated tight Lyapunov anal- ysis for first-order methods. Math. Program. 209(1-2), 133–170 (2025) https://doi.org/ 10.1007/s10107-024-02061-8

work page doi:10.1007/s10107-024-02061-8 2025

[37] [37]

In: International Conference on Machine Learning, pp

Suh, J.J., Roh, G., Ryu, E.K.: Continuous-time analysis of accelerated gra- dient methods via conservation laws in dilated coordinate systems. In: International Conference on Machine Learning, pp. 20640–20667 (2022). https://proceedings.mlr.press/v162/suh22a.html

2022

[38] [38]

Moucer, C., Taylor, A., Bach, F.: A systematic approach to Lyapunov analyses of continuous-time models in convex optimization. SIAM J. Optim. 33(3), 1558–1586 (2023) https://doi.org/10.1137/22M1498486

work page doi:10.1137/22m1498486 2023

[39] [39]

JSIAM Letters 16, 29–32 (2024) https://doi.org/10.14495/jsiaml.16.29

Kamijima, T., Sato, S., Ushiyama, K., Matsuo, T., Tanaka, K.: Analysis of continuous dynamical system models with hessians derived from optimization methods. JSIAM Letters 16, 29–32 (2024) https://doi.org/10.14495/jsiaml.16.29

work page doi:10.14495/jsiaml.16.29 2024

[40] [40]

Drori, Y ., Teboulle, M.: Performance of first-order methods for smooth convex mini- mization: a novel approach. Math. Program. 145(1-2, Ser. A), 451–482 (2014) https: //doi.org/10.1007/s10107-013-0653-0 49

work page doi:10.1007/s10107-013-0653-0 2014

[41] [41]

Taylor, A.B., Hendrickx, J.M., Glineur, F.: Smooth strongly convex interpolation and exact worst-case performance of first-order methods. Math. Program. 161(1-2), 307– 345 (2017) https://doi.org/10.1007/s10107-016-1009-3

work page doi:10.1007/s10107-016-1009-3 2017

[42] [42]

In: 2017 IEEE 56th Annual Conference on Decision and Control (CDC), pp

Taylor, A.B., Hendrickx, J.M., Glineur, F.: Performance estimation toolbox (PESTO): Automated worst-case analysis of first-order optimization methods. In: 2017 IEEE 56th Annual Conference on Decision and Control (CDC), pp. 1278–1283 (2017). https://doi. org/10.1109/CDC.2017.8263832

work page doi:10.1109/cdc.2017.8263832 2017

[43] [43]

arXiv preprint arXiv:2201.04040 (2022)

Goujaud, B., Moucer, C., Glineur, F., Hendrickx, J., Taylor, A., Dieuleveut, A.: PEPit: computer-assisted worst-case analyses of first-order optimization methods in Python. arXiv preprint arXiv:2201.04040 (2022)

work page arXiv 2022

[44] [44]

Kim, D., Fessler, J.A.: Optimized first-order methods for smooth convex mini- mization. Math. Program. 159(1-2, Ser. A), 81–107 (2016) https://doi.org/10.1007/ s10107-015-0949-3

2016

[45] [45]

Taylor, A., Drori, Y .: An optimal gradient method for smooth strongly convex minimiza- tion. Math. Program., 1–38 (2022) https://doi.org/10.1007/s10107-022-01839-y

work page doi:10.1007/s10107-022-01839-y 2022

[46] [46]

Jang, U., Gupta, S.D., Ryu, E.K.: Computer-assisted design of accelerated compos- ite optimization methods: OptISTA. Math. Program. in press (2025) https://doi.org/10. 1007/s10107-025-02258-5

2025

[47] [47]

Drori, Y ., Teboulle, M.: An optimal variant of Kelley’s cutting-plane method. Math. Program. 160(1-2), 321–351 (2016) https://doi.org/10.1007/s10107-016-0985-7

work page doi:10.1007/s10107-016-0985-7 2016

[48] [48]

Kim, D.: Accelerated proximal point method for maximally monotone operators. Math. Program. 190(1-2), 57–87 (2021) https://doi.org/10.1007/s10107-021-01643-0

work page doi:10.1007/s10107-021-01643-0 2021

[49] [49]

Lieder, F.: On the convergence rate of the Halpern-iteration. Optim. Lett.15(2), 405–418 (2021) https://doi.org/10.1007/s11590-020-01617-9

work page doi:10.1007/s11590-020-01617-9 2021

[50] [50]

In: Proceedings of the 39th International Conference on Machine Learning, vol

Park, J., Ryu, E.K.: Exact optimal accelerated complexity for fixed-point iterations. In: Proceedings of the 39th International Conference on Machine Learning, vol. 162, pp. 17420–17457 (2022). https://proceedings.mlr.press/v162/park22c.html

2022

[51] [51]

Kim, D., Fessler, J.A.: Optimizing the efficiency of first-order methods for decreas- ing the gradient of smooth convex functions. J. Optim. Theory Appl. 188(1), 192–219 (2021) https://doi.org/10.1007/s10957-020-01770-2

work page doi:10.1007/s10957-020-01770-2 2021

[52] [52]

In: Proceedings of the 38th International Conference on Machine Learning, vol

Yoon, T., Ryu, E.K.: Accelerated algorithms for smooth convex-concave minimax problems with o(1/k2) rate on squared gradient norm. In: Proceedings of the 38th International Conference on Machine Learning, vol. 139, pp. 12098–12109 (2021). https://proceedings.mlr.press/v139/yoon21d.html

2021

[53] [53]

Kim, D., Fessler, J.A.: Another look at the fast iterative shrinkage/thresholding 50 algorithm (FISTA). SIAM J. Optim. 28(1), 223–250 (2018) https://doi.org/10.1137/ 16M108940X

2018

[54] [54]

Klerk, E., Glineur, F.c., Taylor, A.B.: On the worst-case complexity of the gradient method with exact line search for smooth strongly convex functions. Optim. Lett.11(7), 1185–1199 (2017) https://doi.org/10.1007/s11590-016-1087-4

work page doi:10.1007/s11590-016-1087-4 2017

[55] [55]

In: Proceedings of the 37th International Conference on Machine Learning, vol

Drori, Y ., Shamir, O.: The complexity of finding stationary points with stochastic gradi- ent descent. In: Proceedings of the 37th International Conference on Machine Learning, vol. 119, pp. 2658–2667 (2020). https://proceedings.mlr.press/v119/drori20a.html

2020

[56] [56]

Das Gupta, S., Freund, R.M., Sun, X.A., Taylor, A.: Nonlinear conjugate gradient methods: worst-case convergence rates via computer-assisted analyses. Math. Program. 213(1-2), 1–49 (2025) https://doi.org/10.1007/s10107-024-02127-7

work page doi:10.1007/s10107-024-02127-7 2025

[57] [57]

Dragomir, R.-A., Taylor, A.B., d’Aspremont, A., Bolte, J.: Optimal complexity and certification of Bregman first-order methods. Math. Program. 194(1-2), 41–83 (2022) https://doi.org/10.1007/s10107-021-01618-1

work page doi:10.1007/s10107-021-01618-1 2022

[58] [58]

Ryu, E.K., Taylor, A.B., Bergeling, C., Giselsson, P.: Operator splitting performance estimation: tight contraction factors and optimal parameter selection. SIAM J. Optim. 30(3), 2251–2271 (2020) https://doi.org/10.1137/19M1304854

work page doi:10.1137/19m1304854 2020

[59] [59]

Klerk, E., Glineur, F.c., Taylor, A.B.: Worst-case convergence analysis of inexact gra- dient and Newton methods through semidefinite programming performance estimation. SIAM J. Optim. 30(3), 2053–2082 (2020) https://doi.org/10.1137/19M1281368

work page doi:10.1137/19m1281368 2053

[60] [60]

Barr ´e, M., Taylor, A.B., Bach, F.: Principled analyses and design of first-order methods with inexact proximal operators. Math. Program. 201(1-2), 185–230 (2023) https://doi. org/10.1007/s10107-022-01903-7

work page doi:10.1007/s10107-022-01903-7 2023

[61] [61]

Das Gupta, S., Van Parys, B.P.G., Ryu, E.K.: Branch-and-bound performance esti- mation programming: a unified methodology for constructing optimal optimiza- tion methods. Math. Program. 204(1-2), 567–639 (2024) https://doi.org/10.1007/ s10107-023-01973-1

2024

[62] [62]

In: Advances in Neural Information Processing Systems, vol

Kim, J., Yang, I.: Convergence analysis of ODE models for accelerated first-order meth- ods via positive semidefinite kernels. In: Advances in Neural Information Processing Systems, vol. 37 (2023). https://openreview.net/forum?id=XFE6zpevLc

2023

[63] [63]

METR 2024-02 (2024)

Ushiyama, K., Sato, S., Matsuo, T.: Deriving optimal rates of continuous-time acceler- ated first-order methods via performance estimation problems. METR 2024-02 (2024)

2024

[64] [64]

arXiv preprint arXiv:2504.17193 (2025) 51

Bot ¸, R.I., Dao, M.N., Liu, T., Lourenc ¸o, B.F., Marumo, N.: On the equiva- lence of a hessian-free inequality and lipschitz continuous hessian. arXiv preprint arXiv:2504.17193 (2025) 51

work page arXiv 2025

[65] [65]

International Journal of Mathematical Modelling and Numerical Optimisation 4(2), 150–194 (2013) 52

Jamil, M., Yang, X.-S.: A literature survey of benchmark functions for global opti- misation problems. International Journal of Mathematical Modelling and Numerical Optimisation 4(2), 150–194 (2013) 52

2013