A Restart-Free Accelerated Algorithm for Non-Convex Minimization: Continuous and Discrete Analysis
Pith reviewed 2026-06-30 05:22 UTC · model grok-4.3
The pith
Two new first-order methods reach an ε-approximate stationary point for nonconvex problems in O(ε^{-7/4}) evaluations without restarts or prior knowledge of ε.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
By discretizing a newly proposed continuous-time ODE model, the authors obtain two restart-free first-order methods that attain the O(ε^{-7/4}) complexity for finding an ε-approximate first-order stationary point of a nonconvex function with Lipschitz gradient and Hessian, without using ε as an input parameter and without any restart mechanism. The continuous- and discrete-time analyses proceed in parallel under the Performance Estimation Problem framework. One version yields a simple implementation whose final iterate remains differentiable with respect to the initial point; the other version estimates the Lipschitz constants adaptively and therefore requires no a priori knowledge of those
What carries the argument
A newly introduced continuous-time ordinary differential equation model whose discretizations produce the algorithms and whose convergence properties transfer to the discrete setting under the Performance Estimation Problem framework.
If this is right
- The first algorithm admits a simple implementation in which the last iterate is differentiable with respect to the initial point.
- The second algorithm estimates the required Lipschitz constants during execution and therefore needs no a priori knowledge of them.
- Both algorithms avoid restart mechanisms while matching the best known complexity for this problem class.
- The continuous and discrete analyses are carried out in parallel inside the same Performance Estimation Problem framework.
Where Pith is reading between the lines
- Because the final iterate of the first algorithm remains differentiable, the method could be embedded inside larger differentiable pipelines such as hyperparameter optimization or end-to-end learning.
- The restart-free design may reduce sensitivity to the choice of restart frequency on problems where the landscape changes across scales.
- The parallel continuous-discrete analysis suggests that similar ODE models could be discretized for other nonconvex problem classes while preserving the PEP-based rate.
- Adaptive estimation of Lipschitz constants without restarts may improve practical robustness on functions where the local smoothness varies.
Load-bearing premise
The convergence guarantees proved for the continuous ODE model carry over to its discretizations inside the Performance Estimation Problem framework without extra error terms that would worsen the rate.
What would settle it
A concrete nonconvex function with known Lipschitz constants on which the proposed discrete algorithm requires more than O(ε^{-7/4}) evaluations to reach an ε-stationary point, or a counter-example showing that the PEP analysis introduces a discretization error that degrades the rate below O(ε^{-7/4}).
read the original abstract
We propose two novel first-order methods for minimizing nonconvex functions with Lipschitz-continuous gradients and Hessians. These algorithms attain an $\varepsilon$-approximate first-order stationary point in $\mathrm{O}(\varepsilon^{-7/4})$ function and gradient evaluations, without using $\varepsilon$ as an input parameter. While existing methods rely on restart mechanisms to achieve this complexity, our methods do not. Consequently, the first algorithm enjoys a simple implementation, making its last iterate differentiable with respect to the initial point. By estimating the Lipschitz constants adaptively, we develop the second algorithm that does not require prior knowledge of the constants. This algorithm exhibits better numerical performance than existing parameter-free methods for certain problems, which can be attributed to its restart-free design. Both algorithms are derived by discretizing a newly introduced continuous-time model represented by an ordinary differential equation, and their continuous- and discrete-time convergence analyses proceed in a parallel manner under the Performance Estimation Problem framework.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes two novel restart-free first-order methods for non-convex minimization of functions with Lipschitz gradients and Hessians. Derived by discretizing a new continuous-time ODE, the methods achieve an O(ε^{-7/4}) complexity bound for ε-approximate first-order stationary points without restarts or knowledge of ε in advance. One method is simple with a differentiable last iterate; the second adaptively estimates the Lipschitz constants. Convergence analyses for both the continuous ODE and discrete algorithms are carried out in parallel under the Performance Estimation Problem (PEP) framework.
Significance. If the central claims hold, the work is significant for providing restart-free accelerated rates in non-convex optimization, simplifying implementation compared to restart-based methods and enabling better practical performance in the adaptive variant. The parallel continuous-discrete PEP analysis is a methodological contribution, and the differentiability property supports applications such as bilevel optimization. The adaptive algorithm's lack of prior parameter knowledge addresses a common practical limitation.
major comments (2)
- [§5] §5 (discretization and PEP transfer): the analysis transfers the O(ε^{-7/4}) rate from the continuous ODE to the discrete algorithm via PEP but does not explicitly bound or cancel discretization error terms; without such bounds the rate preservation for the discrete iterates is not guaranteed and requires additional justification.
- [§6.2] §6.2 (adaptive Lipschitz estimation): the second algorithm's adaptive estimation procedure is claimed to preserve the O(ε^{-7/4}) bound on function/gradient evaluations, yet the number of extra evaluations required for the estimation is not bounded in the PEP formulation, which could inflate the total complexity.
minor comments (2)
- [Notation] Notation for the new ODE (Eq. (3) or equivalent) uses several auxiliary functions whose definitions are spread across sections; a consolidated table of symbols would improve readability.
- [Numerical experiments] Figure 2 (numerical comparison) lacks error bars or multiple random seeds, making it difficult to assess whether the reported advantage over restart-based baselines is statistically consistent.
Simulated Author's Rebuttal
We thank the referee for their careful reading and constructive comments on our manuscript. We address the two major comments below, proposing revisions to strengthen the presentation and analysis.
read point-by-point responses
-
Referee: [§5] §5 (discretization and PEP transfer): the analysis transfers the O(ε^{-7/4}) rate from the continuous ODE to the discrete algorithm via PEP but does not explicitly bound or cancel discretization error terms; without such bounds the rate preservation for the discrete iterates is not guaranteed and requires additional justification.
Authors: We appreciate the referee pointing out the need for clearer justification. The PEP analysis is formulated directly on the discrete algorithm's update rules (in parallel with the continuous ODE), so that the rate is obtained without relying on a transfer from the continuous-time bound. Nevertheless, to make this explicit and address the concern about discretization error terms, we will revise §5 to include a dedicated paragraph explaining why no additional error bounds are required in the PEP setup and how the discrete dynamics are captured exactly. revision: yes
-
Referee: [§6.2] §6.2 (adaptive Lipschitz estimation): the second algorithm's adaptive estimation procedure is claimed to preserve the O(ε^{-7/4}) bound on function/gradient evaluations, yet the number of extra evaluations required for the estimation is not bounded in the PEP formulation, which could inflate the total complexity.
Authors: We thank the referee for this observation. The adaptive estimation steps are part of the algorithm, but we agree that their contribution to the total number of evaluations must be bounded explicitly within the PEP framework. We will revise §6.2 to augment the PEP formulation with the estimation procedure and derive a bound showing that the extra evaluations are absorbed into the overall O(ε^{-7/4}) complexity without inflation. revision: yes
Circularity Check
No significant circularity identified
full rationale
The paper introduces a new continuous-time ODE and derives discrete algorithms by discretization, with both continuous and discrete analyses conducted in parallel under the PEP framework. No equations or steps are quoted that reduce the claimed O(ε^{-7/4}) rate to a fitted parameter, self-defined quantity, or self-citation chain by construction. The abstract presents the ODE as newly introduced rather than reverse-engineered from the target rate, and the PEP application is described as transferring properties without additional degrading terms. This leaves the central claim with independent analytical content, consistent with a self-contained derivation against external benchmarks.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption Objective functions have Lipschitz-continuous gradients and Hessians
invented entities (1)
-
New continuous-time ODE model
no independent evidence
Reference graph
Works this paper leans on
-
[1]
Springer Optimization and Its Applications, vol
Nesterov, Y .: Lectures on Convex Optimization. Springer Optimization and Its Applications, vol. 137, p. 589. Springer, Cham (2018). https://doi.org/10.1007/ 978-3-319-91578-4
2018
-
[2]
In: Proceedings of the 39th Inter- national Conference on Machine Learning, vol
Li, H., Lin, Z.: Restarted nonconvex accelerated gradient descent: No more poly- logarithmic factor in the o(ε−7/4) complexity. In: Proceedings of the 39th Inter- national Conference on Machine Learning, vol. 162, pp. 12901–12916 (2022). https://proceedings.mlr.press/v162/li22o.html
2022
-
[3]
Marumo, N., Takeda, A.: Parameter-free accelerated gradient descent for noncon- vex minimization. SIAM J. Optim. 34(2), 2093–2120 (2024) https://doi.org/10.1137/ 22M1540934
2093
-
[4]
Marumo, N., Takeda, A.: Universal heavy-ball method for nonconvex optimization under H ¨older continuous Hessians. Math. Program. 212(1-2), 147–175 (2025) https: //doi.org/10.1007/s10107-024-02100-4 46
-
[5]
Sharp First-Order Lower Bounds for Higher-Order Smooth Nonconvex Optimization
Zhou, D.: Sharp First-Order Lower Bounds for Higher-Order Smooth Nonconvex Optimization. https://arxiv.org/abs/2606.05438
work page internal anchor Pith review Pith/arXiv arXiv
-
[6]
Cartis, C., Gould, N.I.M., Toint, P..L.: On the complexity of steepest descent, Newton’s and regularized Newton’s methods for nonconvex unconstrained optimization problems. SIAM J. Optim. 20(6), 2833–2852 (2010) https://doi.org/10.1137/090774100
-
[7]
Carmon, Y ., Duchi, J.C., Hinder, O., Sidford, A.: Lower bounds for finding sta- tionary points I. Math. Program. 184(1-2), 71–120 (2020) https://doi.org/10.1007/ s10107-019-01406-y
2020
-
[8]
Ochs, P., Chen, Y ., Brox, T., Pock, T.: iPiano: inertial proximal algorithm for nonconvex optimization. SIAM J. Imaging Sci. 7(2), 1388–1419 (2014) https://doi.org/10.1137/ 130942954
2014
-
[9]
O’Neill, M., Wright, S.J.: Behavior of accelerated gradient methods near critical points of nonconvex functions. Math. Program. 176(1-2), 403–427 (2019) https://doi.org/10. 1007/s10107-018-1340-y
2019
-
[10]
In: Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing, pp
Agarwal, N., Allen-Zhu, Z., Bullins, B., Hazan, E., Ma, T.: Finding approximate local minima faster than gradient descent. In: Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing, pp. 1195–1199 (2017). https://doi.org/10.1145/ 3055399.3055464
-
[11]
Carmon, Y ., Duchi, J.C., Hinder, O., Sidford, A.: Accelerated methods for noncon- vex optimization. SIAM J. Optim. 28(2), 1751–1772 (2018) https://doi.org/10.1137/ 17M1114296
2018
-
[12]
Royer, C.W., O’Neill, M., Wright, S.J.: A Newton-CG algorithm with complexity guarantees for smooth unconstrained optimization. Math. Program. 180(1-2), 451–488 (2020) https://doi.org/10.1007/s10107-019-01362-7
-
[13]
Royer, C.W., Wright, S.J.: Complexity analysis of second-order line-search algorithms for smooth nonconvex optimization. SIAM J. Optim. 28(2), 1448–1477 (2018) https: //doi.org/10.1137/17M1134329
-
[14]
NEON+: Accelerated Gradient Methods for Extracting Negative Curvature for Non-Convex Optimization
Xu, Y ., Jin, R., Yang, T.: Neon+: Accelerated gradient methods for extracting negative curvature for non-convex optimization. arXiv preprint arXiv:1712.01033 (2017)
work page internal anchor Pith review Pith/arXiv arXiv 2017
-
[15]
In: Advances in Neural Information Processing Systems, vol
Allen-Zhu, Z., Li, Y .: Neon2: Finding local minima via first-order oracles. In: Advances in Neural Information Processing Systems, vol. 31 (2018)
2018
-
[16]
In: Proceedings of the 31st Conference On Learning The- ory, vol
Jin, C., Netrapalli, P., Jordan, M.I.: Accelerated gradient descent escapes saddle points faster than gradient descent. In: Proceedings of the 31st Conference On Learning The- ory, vol. 75, pp. 1042–1085 (2018). https://proceedings.mlr.press/v75/jin18a.html
2018
-
[17]
Convex until proven guilty
Carmon, Y ., Duchi, J.C., Hinder, O., Sidford, A.: “Convex until proven guilty”: 47 Dimension-free acceleration of gradient descent on non-convex functions. In: Proceed- ings of the 34th International Conference on Machine Learning, vol. 70, pp. 654–663 (2017). https://proceedings.mlr.press/v70/carmon17a.html
2017
-
[18]
In: Pro- ceedings of the 57th Annual ACM Symposium on Theory of Computing, pp
Jiang, R., Mokhtari, A., Patitucci, F.: Improved complexity for smooth nonconvex optimization: A two-level online learning approach with quasi-newton methods. In: Pro- ceedings of the 57th Annual ACM Symposium on Theory of Computing, pp. 2225–2236 (2025). https://doi.org/10.1145/3717823.3718308
-
[19]
Carmon, Y ., Duchi, J.C., Hinder, O., Sidford, A.: Lower bounds for finding stationary points II: first-order methods. Math. Program.185(1-2), 315–355 (2021) https://doi.org/ 10.1007/s10107-019-01431-x
-
[20]
In: Advances in Neural Information Processing Systems, vol
Andrychowicz, M., Denil, M., G ´omez, S., Hoffman, M.W., Pfau, D., Schaul, T., Shillingford, B., Freitas, N.: Learning to learn by gradient descent by gradient descent. In: Advances in Neural Information Processing Systems, vol. 29 (2016)
2016
-
[21]
In: Proceedings of the 34th International Conference on Machine Learn- ing, vol
Finn, C., Abbeel, P., Levine, S.: Model-agnostic meta-learning for fast adaptation of deep networks. In: Proceedings of the 34th International Conference on Machine Learn- ing, vol. 70, pp. 1126–1135 (2017).https://proceedings.mlr.press/v70/finn17a.html
2017
-
[22]
In: International Conference on Learning Representations (2017)
Metz, L., Poole, B., Pfau, D., Sohl-Dickstein, J.: Unrolled generative adversar- ial networks. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=BydrOIcle
2017
-
[23]
In: Proceedings of the 27th International Conference on International Conference on Machine Learning, pp
Gregor, K., LeCun, Y .: Learning fast approximations of sparse coding. In: Proceedings of the 27th International Conference on International Conference on Machine Learning, pp. 399–406 (2010)
2010
-
[24]
In: Advances in Neural Information Processing Systems, vol
Yang, Y ., Sun, J., Li, H., Xu, Z.: Deep ADMM-Net for compressive sensing mri. In: Advances in Neural Information Processing Systems, vol. 29 (2016)
2016
-
[25]
Polyak, B.T.: Some methods of speeding up the convergence of iterative methods. USSR Comput. Math. Math. Phys. 4(5), 1–17 (1964) https://doi.org/10.1016/0041-5553(64) 90137-5
-
[26]
Su, W., Boyd, S., Cand `es, E.J.: A differential equation for modeling Nesterov’s accel- erated gradient method: theory and insights. J. Mach. Learn. Res. 17(153), 1–43 (2016)
2016
-
[27]
Nesterov, Y .E.: A method for solving the convex programming problem with conver- gence rate O(1/k2). Dokl. Akad. Nauk SSSR 269(3), 543–547 (1983)
1983
-
[28]
In: Advances in Neural Information Processing Systems, vol
Krichene, W., Bayen, A., Bartlett, P.L.: Accelerated mirror descent in continuous and discrete time. In: Advances in Neural Information Processing Systems, vol. 28 (2015)
2015
-
[29]
Shi, B., Du, S.S., Jordan, M.I., Su, W.J.: Understanding the acceleration phenomenon 48 via high-resolution differential equations. Math. Program. 195(1-2), 79–148 (2022) https://doi.org/10.1007/s10107-021-01681-8
-
[30]
Wilson, A.C., Recht, B., Jordan, M.I.: A Lyapunov analysis of accelerated methods in optimization. J. Mach. Learn. Res. 22(113), 1–34 (2021)
2021
-
[31]
arXiv preprint arXiv:2406.06100 (2024)
Okamura, K., Marumo, N., Takeda, A.: Primitive heavy-ball dynamics achieves o(ε−7/4) convergence for nonconvex optimization. arXiv preprint arXiv:2406.06100 (2024)
-
[32]
In: Advances in Neural Information Processing Systems, vol
Ushiyama, K., Sato, S., Matsuo, T.: A unified discretization framework for differential equation approach with Lyapunov arguments for convex optimiza- tion. In: Advances in Neural Information Processing Systems, vol. 37 (2023). https://openreview.net/forum?id=8YN62t19AW
2023
-
[33]
Foundations and Trends in Optimization 5(1-2), 1–245 (2021) https://doi.org/10.1561/2400000036
d’Aspremont, A., Scieur, D., Taylor, A.: Acceleration methods. Foundations and Trends in Optimization 5(1-2), 1–245 (2021) https://doi.org/10.1561/2400000036
-
[34]
In: International Conference on Machine Learning (2018)
Taylor, A., Van Scoy, B., Lessard, L.: Lyapunov functions for first-order methods: Tight automated convergence guarantees. In: International Conference on Machine Learning (2018). https://proceedings.mlr.press/v80/taylor18a.html
2018
-
[35]
In: Proceedings of the Thirty- Second Conference on Learning Theory, vol
Taylor, A., Bach, F.: Stochastic first-order methods: non-asymptotic and computer-aided analyses via potential functions. In: Proceedings of the Thirty- Second Conference on Learning Theory, vol. 99, pp. 2934–2992 (2019). https://proceedings.mlr.press/v99/taylor19a.html
2019
-
[36]
Upadhyaya, M., Banert, S., Taylor, A.B., Giselsson, P.: Automated tight Lyapunov anal- ysis for first-order methods. Math. Program. 209(1-2), 133–170 (2025) https://doi.org/ 10.1007/s10107-024-02061-8
-
[37]
In: International Conference on Machine Learning, pp
Suh, J.J., Roh, G., Ryu, E.K.: Continuous-time analysis of accelerated gra- dient methods via conservation laws in dilated coordinate systems. In: International Conference on Machine Learning, pp. 20640–20667 (2022). https://proceedings.mlr.press/v162/suh22a.html
2022
-
[38]
Moucer, C., Taylor, A., Bach, F.: A systematic approach to Lyapunov analyses of continuous-time models in convex optimization. SIAM J. Optim. 33(3), 1558–1586 (2023) https://doi.org/10.1137/22M1498486
-
[39]
JSIAM Letters 16, 29–32 (2024) https://doi.org/10.14495/jsiaml.16.29
Kamijima, T., Sato, S., Ushiyama, K., Matsuo, T., Tanaka, K.: Analysis of continuous dynamical system models with hessians derived from optimization methods. JSIAM Letters 16, 29–32 (2024) https://doi.org/10.14495/jsiaml.16.29
-
[40]
Drori, Y ., Teboulle, M.: Performance of first-order methods for smooth convex mini- mization: a novel approach. Math. Program. 145(1-2, Ser. A), 451–482 (2014) https: //doi.org/10.1007/s10107-013-0653-0 49
-
[41]
Taylor, A.B., Hendrickx, J.M., Glineur, F.: Smooth strongly convex interpolation and exact worst-case performance of first-order methods. Math. Program. 161(1-2), 307– 345 (2017) https://doi.org/10.1007/s10107-016-1009-3
-
[42]
In: 2017 IEEE 56th Annual Conference on Decision and Control (CDC), pp
Taylor, A.B., Hendrickx, J.M., Glineur, F.: Performance estimation toolbox (PESTO): Automated worst-case analysis of first-order optimization methods. In: 2017 IEEE 56th Annual Conference on Decision and Control (CDC), pp. 1278–1283 (2017). https://doi. org/10.1109/CDC.2017.8263832
-
[43]
arXiv preprint arXiv:2201.04040 (2022)
Goujaud, B., Moucer, C., Glineur, F., Hendrickx, J., Taylor, A., Dieuleveut, A.: PEPit: computer-assisted worst-case analyses of first-order optimization methods in Python. arXiv preprint arXiv:2201.04040 (2022)
-
[44]
Kim, D., Fessler, J.A.: Optimized first-order methods for smooth convex mini- mization. Math. Program. 159(1-2, Ser. A), 81–107 (2016) https://doi.org/10.1007/ s10107-015-0949-3
2016
-
[45]
Taylor, A., Drori, Y .: An optimal gradient method for smooth strongly convex minimiza- tion. Math. Program., 1–38 (2022) https://doi.org/10.1007/s10107-022-01839-y
-
[46]
Jang, U., Gupta, S.D., Ryu, E.K.: Computer-assisted design of accelerated compos- ite optimization methods: OptISTA. Math. Program. in press (2025) https://doi.org/10. 1007/s10107-025-02258-5
2025
-
[47]
Drori, Y ., Teboulle, M.: An optimal variant of Kelley’s cutting-plane method. Math. Program. 160(1-2), 321–351 (2016) https://doi.org/10.1007/s10107-016-0985-7
-
[48]
Kim, D.: Accelerated proximal point method for maximally monotone operators. Math. Program. 190(1-2), 57–87 (2021) https://doi.org/10.1007/s10107-021-01643-0
-
[49]
Lieder, F.: On the convergence rate of the Halpern-iteration. Optim. Lett.15(2), 405–418 (2021) https://doi.org/10.1007/s11590-020-01617-9
-
[50]
In: Proceedings of the 39th International Conference on Machine Learning, vol
Park, J., Ryu, E.K.: Exact optimal accelerated complexity for fixed-point iterations. In: Proceedings of the 39th International Conference on Machine Learning, vol. 162, pp. 17420–17457 (2022). https://proceedings.mlr.press/v162/park22c.html
2022
-
[51]
Kim, D., Fessler, J.A.: Optimizing the efficiency of first-order methods for decreas- ing the gradient of smooth convex functions. J. Optim. Theory Appl. 188(1), 192–219 (2021) https://doi.org/10.1007/s10957-020-01770-2
-
[52]
In: Proceedings of the 38th International Conference on Machine Learning, vol
Yoon, T., Ryu, E.K.: Accelerated algorithms for smooth convex-concave minimax problems with o(1/k2) rate on squared gradient norm. In: Proceedings of the 38th International Conference on Machine Learning, vol. 139, pp. 12098–12109 (2021). https://proceedings.mlr.press/v139/yoon21d.html
2021
-
[53]
Kim, D., Fessler, J.A.: Another look at the fast iterative shrinkage/thresholding 50 algorithm (FISTA). SIAM J. Optim. 28(1), 223–250 (2018) https://doi.org/10.1137/ 16M108940X
2018
-
[54]
Klerk, E., Glineur, F.c., Taylor, A.B.: On the worst-case complexity of the gradient method with exact line search for smooth strongly convex functions. Optim. Lett.11(7), 1185–1199 (2017) https://doi.org/10.1007/s11590-016-1087-4
-
[55]
In: Proceedings of the 37th International Conference on Machine Learning, vol
Drori, Y ., Shamir, O.: The complexity of finding stationary points with stochastic gradi- ent descent. In: Proceedings of the 37th International Conference on Machine Learning, vol. 119, pp. 2658–2667 (2020). https://proceedings.mlr.press/v119/drori20a.html
2020
-
[56]
Das Gupta, S., Freund, R.M., Sun, X.A., Taylor, A.: Nonlinear conjugate gradient methods: worst-case convergence rates via computer-assisted analyses. Math. Program. 213(1-2), 1–49 (2025) https://doi.org/10.1007/s10107-024-02127-7
-
[57]
Dragomir, R.-A., Taylor, A.B., d’Aspremont, A., Bolte, J.: Optimal complexity and certification of Bregman first-order methods. Math. Program. 194(1-2), 41–83 (2022) https://doi.org/10.1007/s10107-021-01618-1
-
[58]
Ryu, E.K., Taylor, A.B., Bergeling, C., Giselsson, P.: Operator splitting performance estimation: tight contraction factors and optimal parameter selection. SIAM J. Optim. 30(3), 2251–2271 (2020) https://doi.org/10.1137/19M1304854
-
[59]
Klerk, E., Glineur, F.c., Taylor, A.B.: Worst-case convergence analysis of inexact gra- dient and Newton methods through semidefinite programming performance estimation. SIAM J. Optim. 30(3), 2053–2082 (2020) https://doi.org/10.1137/19M1281368
-
[60]
Barr ´e, M., Taylor, A.B., Bach, F.: Principled analyses and design of first-order methods with inexact proximal operators. Math. Program. 201(1-2), 185–230 (2023) https://doi. org/10.1007/s10107-022-01903-7
-
[61]
Das Gupta, S., Van Parys, B.P.G., Ryu, E.K.: Branch-and-bound performance esti- mation programming: a unified methodology for constructing optimal optimiza- tion methods. Math. Program. 204(1-2), 567–639 (2024) https://doi.org/10.1007/ s10107-023-01973-1
2024
-
[62]
In: Advances in Neural Information Processing Systems, vol
Kim, J., Yang, I.: Convergence analysis of ODE models for accelerated first-order meth- ods via positive semidefinite kernels. In: Advances in Neural Information Processing Systems, vol. 37 (2023). https://openreview.net/forum?id=XFE6zpevLc
2023
-
[63]
METR 2024-02 (2024)
Ushiyama, K., Sato, S., Matsuo, T.: Deriving optimal rates of continuous-time acceler- ated first-order methods via performance estimation problems. METR 2024-02 (2024)
2024
-
[64]
arXiv preprint arXiv:2504.17193 (2025) 51
Bot ¸, R.I., Dao, M.N., Liu, T., Lourenc ¸o, B.F., Marumo, N.: On the equiva- lence of a hessian-free inequality and lipschitz continuous hessian. arXiv preprint arXiv:2504.17193 (2025) 51
-
[65]
International Journal of Mathematical Modelling and Numerical Optimisation 4(2), 150–194 (2013) 52
Jamil, M., Yang, X.-S.: A literature survey of benchmark functions for global opti- misation problems. International Journal of Mathematical Modelling and Numerical Optimisation 4(2), 150–194 (2013) 52
2013
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.