Inexactly Smooth Performance Estimation and New Optimized Gradient Methods

Aaron Zoll; Benjamin Grimmer

arxiv: 2606.01505 · v2 · pith:EK5XPFHFnew · submitted 2026-06-01 · 🧮 math.OC

Inexactly Smooth Performance Estimation and New Optimized Gradient Methods

Aaron Zoll , Benjamin Grimmer This is my paper

Pith reviewed 2026-06-28 13:56 UTC · model grok-4.3

classification 🧮 math.OC

keywords inexactly smoothperformance estimationHölder smoothconvex optimizationfirst-order methodsinterpolation theoremsgradient methodsbacktracking

0 comments

The pith

Inexactly smooth convex functions have nearly tight interpolation theorems that support optimal first-order method construction.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper defines a broad class of inexactly smooth convex functions that generalizes several standard smoothness notions including L-smooth, M-Lipschitz, and Hölder smooth cases. It shows these functions satisfy interpolation theorems that are necessary and sufficient up to small universal factors. This allows any first-order method to be analyzed exactly by solving a convex performance estimation problem. The same property supports constructing new methods that achieve optimal or best-known rates for Hölder smooth problems and a backtracking method that works without knowing the smoothness parameters in advance.

Core claim

Inexactly smooth convex functions possess interpolation theorems that are necessary and sufficient up to modest universal constants. These enable analysis of first-order methods for any inexactly smooth convex problem class via solving convex Performance Estimation Problems (PEPs) and the extension of Drori and Taylor's constructive approach to algorithm design, from which an exactly minimax optimal method for (β,0)-Hölder smooth problems, methods with the best-known convergence guarantees up to constants for any (β,p)-Hölder smooth convex minimization, and a new universal fast backtracking method are derived.

What carries the argument

Interpolation theorems for the class of inexactly smooth convex functions

If this is right

Any first-order method on inexactly smooth problems can be analyzed by solving a convex PEP.
An exactly minimax optimal first-order method is obtained for (β,0)-Hölder smooth convex minimization.
Convergence guarantees that are best-known up to constants are achieved for general (β,p)-Hölder smooth problems.
A universal fast backtracking method is constructed that applies to all inexactly smooth convex problems.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The backtracking method may reduce the need for manual tuning of step sizes in practical convex optimization.
Similar interpolation techniques could be developed for other function classes such as weakly convex or nonsmooth problems.
The framework might allow comparison of method performance across different smoothness assumptions in a unified way.

Load-bearing premise

The interpolation theorems hold with only modest universal constants for all combinations of the special cases of inexactly smooth functions.

What would settle it

Finding an inexactly smooth function and a set of points where the interpolation bound is violated by a factor larger than the universal constant, or observing that the proposed optimal method does not achieve the predicted minimax rate on a concrete (β,0)-Hölder smooth instance.

Figures

Figures reproduced from arXiv: 2606.01505 by Aaron Zoll, Benjamin Grimmer.

**Figure 2.** Figure 2: The first plot compares the numerical output of the constructive approach [PITH_FULL_IMAGE:figures/full_fig_p019_2.png] view at source ↗

read the original abstract

We consider a general class of ``inexactly smooth'' convex functions, providing a universal model capturing as special cases $L$-smooth, $M$-Lipschitz, and H\"older smooth functions, and any combination thereof. Such functions possess a calculus closely following that of smooth functions. Our main results provide inexactly smooth functions with interpolation theorems that are necessary and sufficient up to modest universal constants. These enable analysis of first-order methods for any inexactly smooth convex problem class via solving convex Performance Estimation Problems (PEPs). Further, these enable the extension of Drori and Taylor's constructive approach to algorithm design. From this, we derive an exactly minimax optimal method for $(\beta,0)$-H\"older smooth problems, methods with the best-known convergence guarantees up to constants for any $(\beta,p)$-H\"older smooth convex minimization, and a new universal fast backtracking method for any inexactly smooth convex problem.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The inexactly smooth class and its near-tight interpolation theorems are the real addition here, but the exact minimax optimality claim for pure Holder cases sits uneasily with the constant-factor looseness.

read the letter

The paper's main contribution is a single model for convex functions that are inexactly smooth, covering L-smooth, M-Lipschitz, Holder, and their combinations. The interpolation theorems are stated as necessary and sufficient up to modest universal constants, which then lets them set up convex PEPs for any such class and extend the Drori-Taylor constructive design method. From that they produce a method they call exactly minimax optimal for (beta,0)-Holder problems, best-known rates (up to constants) for the general (beta,p) case, and a parameter-free fast backtracking scheme.

The unification itself is useful. Treating mixed smoothness conditions in one framework avoids having to switch analyses case by case, and the backtracking result looks like something that could be implemented directly. The citation pattern follows the standard PEP and Drori-Taylor line without obvious circularity.

The soft spot is the one the stress-test note flags. Interpolation only up to constants means the PEP upper bounds carry a multiplicative gap. Claiming an exactly minimax optimal method for the (beta,0) case therefore requires either that the constants equal 1 in that special case or an extra argument showing the gap cancels when matching the lower bound. The abstract does not indicate which route they take, so the proofs need to be checked on that point. If the constants remain strictly larger than 1 without compensation, the optimality statement would need to be qualified.

This is written for readers already working with performance estimation and worst-case analysis of first-order methods. Someone familiar with the smooth-case PEP literature will see the extension without much extra background. The work is coherent on its own terms and engages the relevant prior results, so it deserves a serious referee even if the exact-optimality part requires revision or clarification.

Referee Report

2 major / 1 minor

Summary. The paper introduces a general class of inexactly smooth convex functions that captures L-smooth, M-Lipschitz, and Hölder smooth functions (and combinations thereof) as special cases. It establishes interpolation theorems for this class that are necessary and sufficient up to modest universal constants. These theorems enable analysis of first-order methods on any such problem class via convex Performance Estimation Problems (PEPs) and extend Drori-Taylor's constructive algorithm design approach. From this framework the paper derives an exactly minimax optimal method for (β,0)-Hölder smooth problems, methods with the best-known guarantees (up to constants) for general (β,p)-Hölder smooth convex minimization, and a new universal fast backtracking method.

Significance. If the interpolation results hold with the stated constants, the work meaningfully extends the PEP methodology beyond the smooth case, providing a unified analysis tool for a broader family of convex problems and enabling new optimized first-order methods. The constructive derivation of an exactly minimax optimal method for the (β,0) case would be a notable advance if the constant-factor looseness is resolved without compromising exactness.

major comments (2)

[Abstract] Abstract: the interpolation theorems are claimed to be necessary and sufficient only 'up to modest universal constants,' yet the paper asserts derivation of an 'exactly minimax optimal method' for (β,0)-Hölder smooth problems. If any universal constant strictly exceeds 1, the PEP upper bounds inherit a multiplicative looseness; it is unclear how exact matching to minimax lower bounds is obtained without an explicit argument that the constants equal 1 in this case or that the looseness cancels exactly in the lower-bound comparison.
[Abstract] Abstract: the assertion that inexactly smooth functions 'possess a calculus closely following that of smooth functions' is load-bearing for the PEP extension to arbitrary combinations of L-smooth, M-Lipschitz, and Hölder smooth cases. The manuscript should supply a concrete verification (or counter-example) that the interpolation theorems remain valid under all such combinations, as this underpins the universality claim.

minor comments (1)

[Abstract] Abstract: the parameterization of Hölder smoothness as (β,p) is introduced without a brief parenthetical definition, which may hinder immediate readability for readers outside the subfield.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful reading and constructive feedback. We address each major comment below.

read point-by-point responses

Referee: [Abstract] Abstract: the interpolation theorems are claimed to be necessary and sufficient only 'up to modest universal constants,' yet the paper asserts derivation of an 'exactly minimax optimal method' for (β,0)-Hölder smooth problems. If any universal constant strictly exceeds 1, the PEP upper bounds inherit a multiplicative looseness; it is unclear how exact matching to minimax lower bounds is obtained without an explicit argument that the constants equal 1 in this case or that the looseness cancels exactly in the lower-bound comparison.

Authors: For the specific case of (β,0)-Hölder smooth functions the interpolation theorem holds with constants exactly equal to 1 (see the statement and proof of Theorem 3.2). This is why the PEP yields an exactly minimax optimal method with no inherited looseness. We will revise the abstract and add a short clarifying sentence in Section 3 to make this distinction explicit. revision: yes
Referee: [Abstract] Abstract: the assertion that inexactly smooth functions 'possess a calculus closely following that of smooth functions' is load-bearing for the PEP extension to arbitrary combinations of L-smooth, M-Lipschitz, and Hölder smooth cases. The manuscript should supply a concrete verification (or counter-example) that the interpolation theorems remain valid under all such combinations, as this underpins the universality claim.

Authors: The interpolation theorems are stated for the general inexactly smooth class, which by definition includes arbitrary combinations of the listed properties. Nevertheless, we agree that an explicit verification for combined cases would strengthen the manuscript. We will add a short subsection (or appendix paragraph) providing a concrete verification for the joint (L,M,β,p) case. revision: yes

Circularity Check

0 steps flagged

No significant circularity; derivation builds on independent interpolation theorems

full rationale

The paper establishes interpolation theorems for the inexactly smooth class that are necessary and sufficient up to modest universal constants; these theorems are presented as the main results and serve as the foundation for setting up convex PEPs and extending the Drori-Taylor design approach. No quoted steps reduce a claimed prediction or optimality result to a fitted parameter, self-definition, or load-bearing self-citation chain. The extension to an exactly minimax optimal method for the (β,0) case is derived from the PEP solution rather than being presupposed by the input class definition. The approach is self-contained against external benchmarks once the interpolation theorems are accepted, with no evidence of the enumerated circular patterns.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Based on abstract only; no specific free parameters, axioms, or invented entities identifiable without full text.

pith-pipeline@v0.9.1-grok · 5682 in / 977 out tokens · 21410 ms · 2026-06-28T13:56:44.152148+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

39 extracted references · 12 canonical work pages · 1 internal anchor

[1]

Bauschke and P

H. Bauschke and P. Combettes.Convex Analysis and Monotone Operator Theory in Hilbert Spaces. Springer Cham, 2017

2017
[2]

Boyd and L

S. Boyd and L. Vandenberghe.Convex Optimization. Cambridge University Press, Cambridge, UK, 2004

2004
[3]

S. Bubeck. Convex optimization: Algorithms and complexity.Foundations and Trends in Machine Learning, 8:231–357, 2015.doi:10.1561/2200000050

work page doi:10.1561/2200000050 2015
[4]

de Klerk, F

E. de Klerk, F. Glineur, and A. B. Taylor. On the worst-case complexity of the gradient method with exact line search for smooth strongly convex functions.Optimization Letters, 11(7):1185–1199, 2017

2017
[5]

de Klerk, F

E. de Klerk, F. Glineur, and A. B. Taylor. Worst-case convergence analysis of inexact gradient and Newton methods through semidefinite programming performance estimation.SIAM Journal on Optimization, 30(3):2053–2082, 2020

2053
[6]

Devolder, F

O. Devolder, F. Glineur, and Y. Nesterov. First-order methods of smooth convex optimization with inexact oracle.Mathematical Programming, 146:37–75, 2014

2014
[7]

Diakonikolas and C

J. Diakonikolas and C. Guzmán. Optimization on a finer scale: Bounded local subgradient variation perspective.SIAM Journal on Optimization, 36:152–184, 2026.doi:10.1137/24M1659510

work page doi:10.1137/24m1659510 2026
[8]

Y. Drori. The exact information-based complexity of smooth convex minimization.Journal of Complexity, 39:1–16, 2017.doi:10.1016/j.jco.2016.11.001

work page doi:10.1016/j.jco.2016.11.001 2017
[9]

Drori and A

Y. Drori and A. Taylor. Efficient first-order methods for convex minimization: a constructive approach. Mathematical Programming, 184:183–220, 2020

2020
[10]

Drori and M

Y. Drori and M. Teboulle. Performance of first-order methods for smooth convex minimization: a novel approach.Mathematical Programming, 145:451–482, 2014.doi:10.1007/s10107-013-0653-0

work page doi:10.1007/s10107-013-0653-0 2014
[11]

Drori and M

Y. Drori and M. Teboulle. An optimal variant of kelley’s cutting-plane method.Mathematical Program- ming, 160:321–351, 2016

2016
[12]

M. P. Friedlander, Goodwin A, and T. Hoheisel. From perspective maps to epigraphical projections. Mathematics of Operations Research, 48:1711–1740, 2022

2022
[13]

O. Gannot. A frequency-domain analysis of inexact gradient methods.Mathematical Programming, 194:975–1016, 2022

2022
[14]

Goujaud, C

B. Goujaud, C. Moucer, F. Glineur, J. Hendrickx, A. Taylor, and A. Dieuleveut. PEPit: computer- assisted worst-case analyses of first-order optimization methods in Python.Mathematical Programming Computation, 16:337–367, 2024

2024
[15]

B. Grimmer. On optimal universal first-order methods for minimizing heterogeneous sums.Optimization Letters, 18(2):427–445, 2024.doi:10.1007/s11590-023-02060-2

work page doi:10.1007/s11590-023-02060-2 2024
[16]

Grimmer, K

B. Grimmer, K. Shu, and A. L. Wang. Beyond minimax optimality: a subgame perfect gradient method. Mathematical Programming, 2026. Published online

2026
[17]

Guigues, J

V. Guigues, J. Liang, and R. Monteiro. Universal subgradient and proximal bundle methods for convex and strongly convex hybrid composite optimization, 2025.arXiv:2407.10073

work page arXiv 2025
[18]

J. B. Hiriart-Urruty and C. Lemaréchal.Fundamentals of Convex Analysis. Springer Berlin, Heidelberg, 2001

2001
[19]

Hoheisel

T. Hoheisel. Topics in convex analysis in matrix space. Lecture Notes, Spring School on Variational Analysis, Paseky nad Jizerou, Czech Republic, May 2019. 22

2019
[20]

Kim and J

D. Kim and J. Fessler. Optimized first-order methods for smooth convex minimization.Mathematical Programming, 159:81–107, 2016.doi:10.1007/s10107-015-0949-3

work page doi:10.1007/s10107-015-0949-3 2016
[21]

Li and G

T. Li and G. Lan. A simple uniformly optimal method without line search for convex optimization. Mathematical Programming, 2025. Published online.doi:10.1007/s10107-025-02250-z

work page doi:10.1007/s10107-025-02250-z 2025
[22]

Nonasymptotic analysis of accelerated methods with inexact oracle under absolute error bound, 2025.arXiv:2408.00720

Yin Liu and Sam Davanloo Tajbakhsh. Nonasymptotic analysis of accelerated methods with inexact oracle under absolute error bound, 2025.arXiv:2408.00720

work page arXiv 2025
[23]

Nemirovski and Y

A. Nemirovski and Y. Nesterov. Optimal methods of smooth convex minimization.USSR Comput. Math. Math. Phys., 25:21–30, 1985. URL:ttps://doi.org/10.1016/0041-5553(85)90100-4

work page doi:10.1016/0041-5553(85)90100-4 1985
[24]

Nemirovski and D

A. Nemirovski and D. Yudin.Problem Complexity and Method Efficiency in Optimization. Wiley- Interscience Series in Discrete Mathematics. John Wiley & Sons, New York, 1983. Translated from the Russian by E. R. Dawson

1983
[25]

Nesterov.Introductory Lectures on Convex Optimization: A Basic Course, volume 87 ofApplied Optimization

Y. Nesterov.Introductory Lectures on Convex Optimization: A Basic Course, volume 87 ofApplied Optimization. Kluwer Academic Publishers, Boston, MA, 2004

2004
[26]

Nesterov

Y. Nesterov. Universal gradient methods for convex optimization problems.Mathematical Programming, 152:381–404, 2014.doi:10.1007/s10107-014-0790-0

work page doi:10.1007/s10107-014-0790-0 2014
[27]

Park and E

C. Park and E. Ryu. Optimal first-order algorithms as a function of inequalities.Journal of Machine Learning Research, 25:1–66, 2024. URL:http://jmlr.org/papers/v25/21-1256.html

2024
[28]

Rockafellar.Convex Analysis

R. Rockafellar.Convex Analysis. Princeton University Press, 1996

1996
[29]

Taylor, J

A. Taylor, J. Hendrickx, and F. Glineur. Exact worst-case performance of first-order methods for composite convex optimization.SIAM Journal on Optimization, 27:1283–1313, 2017

2017
[30]

Taylor, J

A. Taylor, J. Hendrickx, and F. Glineur. Smooth strongly convex interpolation and exact worst-case performance of first-order methods.Mathematical Programming, 161:307–345, 2017

2017
[31]

Y. Wu, Y. Ouyang, Z. Zhang, and Q. Luo. Universal and parameter-free gradient sliding for composite optimization, 2026. URL:https://arxiv.org/abs/2603.23492,arXiv:2603.23492

work page internal anchor Pith review Pith/arXiv arXiv 2026
[32]

PhD thesis, Université catholique de Louvain, 2025

Kamri Y.Contributions to Performance Estimation Problems for the Analysis and Design of First-Order Optimization Methods. PhD thesis, Université catholique de Louvain, 2025

2025
[33]

Zamani and F

M. Zamani and F. Glineur. Exact convergence rate of the last iterate in subgradient methods.SIAM Journal on Optimization, 35:2182–2201, 2025. A Deferred Proofs on Inexactly Smooth Calculus Proof of Theorem 2.4.(1.=⇒2.) Here we have that for allx,y with gx∈∂f(x), gy∈∂f(y) and allδ≥0 f(x)≥f(y) +⟨gy,x−y⟩+1 2L(δ)∥gy−gx∥2−δ. Writinggy =gx+(gy−gx)andapplyingYou...

2025
[34]

Noting that gx∈∂f(x)⇐⇒αgx∈∂(αf)(x), applying the change of variablesδ↦→δ/α in (1.4) and rearranging proves the claim
[35]

Therefore, with the change of variablesx↦→Ax−b, (1.4) becomes f(Ay−b)≤f(Ax−b) + ⣨ ATgAx−b,y−x ⟩ + ∥A∥2 opL(δ) 2 ∥y−x∥2 +δ

Note that for anygAx−b∈∂f(Ax−b), by the subgradient chain rule [1, Corollary 16.72], one hasATgAx−b∈∂(f◦(A(·)−b))(x). Therefore, with the change of variablesx↦→Ax−b, (1.4) becomes f(Ay−b)≤f(Ax−b) + ⣨ ATgAx−b,y−x ⟩ + ∥A∥2 opL(δ) 2 ∥y−x∥2 +δ
[36]

Then F(x) = f(x,zx)and F(y) = infzf(y,z )≤ f(y,zx)

For any fixedx, letzx∈argminzf(x,z ). Then F(x) = f(x,zx)and F(y) = infzf(y,z )≤ f(y,zx). Since f(x,z )is L(·)-inexactly smooth it holds for any(x,z ),( y,z′)and( gx,gz)∈ ∂f(x,z)and anyδ≥0that f(y,z′)≤f(x,z) +⟨gx,y−x⟩+ ⟨ gz,z′−z ⟩ + L(δ) 2 (∥x−y∥2 +∥z−z′∥2) +δ. From the subdifferential calculus rule [12, Theorem 1], we know that∂F(x) ={gx : (gx,0)∈ ∂f(x,z...
[37]

Taking the infimum over all selections ofδi≥0gives the claimed formula

For any choice ofδi≥0, summing each inequality in(A.1) with ∑m i=1δi =δand utilizing the sum rule [28, Theorem 23.8] establishes a suitable quadratic upper bound for the function∑m i=1fi. Taking the infimum over all selections ofδi≥0gives the claimed formula
[38]

By [1, Theorem 18.5], lettingI∗(gx) ={i:f∗(gx) =f∗ i (gx)}the subdifferential of this maximum is given by ∂f∗(gx) = conv ⋃ i∈I∗(gx) ∂f∗ i (gx)

Consider the function given by the inner maximumf∗(gx) = maxif∗ i (gx). By [1, Theorem 18.5], lettingI∗(gx) ={i:f∗(gx) =f∗ i (gx)}the subdifferential of this maximum is given by ∂f∗(gx) = conv ⋃ i∈I∗(gx) ∂f∗ i (gx). Fix any gx with x∈∂f∗(gx), which the above formula guarantees must take the form x= ∑ i∈I∗(gx)αixi withx i∈∂f∗ i (gx). Then for anygy, f∗(gy)...
[39]

Taking the supremum over choices ofδi≥0with ∑δi =δtightens this bound, from which the claimed inexact smoothness follows again by Theorem 2.4

Summing the bounds(A.2) with any selection ofδi≥0verifies the ∑1/Li(δi)-inexact strong convexity of∑f∗ i . Taking the supremum over choices ofδi≥0with ∑δi =δtightens this bound, from which the claimed inexact smoothness follows again by Theorem 2.4. Proof of Theorem 2.7.For anyx,y∈Rd, gx∈∂f(x), andδ≥0, it suffices to show that the followingS f function is...

[1] [1]

Bauschke and P

H. Bauschke and P. Combettes.Convex Analysis and Monotone Operator Theory in Hilbert Spaces. Springer Cham, 2017

2017

[2] [2]

Boyd and L

S. Boyd and L. Vandenberghe.Convex Optimization. Cambridge University Press, Cambridge, UK, 2004

2004

[3] [3]

S. Bubeck. Convex optimization: Algorithms and complexity.Foundations and Trends in Machine Learning, 8:231–357, 2015.doi:10.1561/2200000050

work page doi:10.1561/2200000050 2015

[4] [4]

de Klerk, F

E. de Klerk, F. Glineur, and A. B. Taylor. On the worst-case complexity of the gradient method with exact line search for smooth strongly convex functions.Optimization Letters, 11(7):1185–1199, 2017

2017

[5] [5]

de Klerk, F

E. de Klerk, F. Glineur, and A. B. Taylor. Worst-case convergence analysis of inexact gradient and Newton methods through semidefinite programming performance estimation.SIAM Journal on Optimization, 30(3):2053–2082, 2020

2053

[6] [6]

Devolder, F

O. Devolder, F. Glineur, and Y. Nesterov. First-order methods of smooth convex optimization with inexact oracle.Mathematical Programming, 146:37–75, 2014

2014

[7] [7]

Diakonikolas and C

J. Diakonikolas and C. Guzmán. Optimization on a finer scale: Bounded local subgradient variation perspective.SIAM Journal on Optimization, 36:152–184, 2026.doi:10.1137/24M1659510

work page doi:10.1137/24m1659510 2026

[8] [8]

Y. Drori. The exact information-based complexity of smooth convex minimization.Journal of Complexity, 39:1–16, 2017.doi:10.1016/j.jco.2016.11.001

work page doi:10.1016/j.jco.2016.11.001 2017

[9] [9]

Drori and A

Y. Drori and A. Taylor. Efficient first-order methods for convex minimization: a constructive approach. Mathematical Programming, 184:183–220, 2020

2020

[10] [10]

Drori and M

Y. Drori and M. Teboulle. Performance of first-order methods for smooth convex minimization: a novel approach.Mathematical Programming, 145:451–482, 2014.doi:10.1007/s10107-013-0653-0

work page doi:10.1007/s10107-013-0653-0 2014

[11] [11]

Drori and M

Y. Drori and M. Teboulle. An optimal variant of kelley’s cutting-plane method.Mathematical Program- ming, 160:321–351, 2016

2016

[12] [12]

M. P. Friedlander, Goodwin A, and T. Hoheisel. From perspective maps to epigraphical projections. Mathematics of Operations Research, 48:1711–1740, 2022

2022

[13] [13]

O. Gannot. A frequency-domain analysis of inexact gradient methods.Mathematical Programming, 194:975–1016, 2022

2022

[14] [14]

Goujaud, C

B. Goujaud, C. Moucer, F. Glineur, J. Hendrickx, A. Taylor, and A. Dieuleveut. PEPit: computer- assisted worst-case analyses of first-order optimization methods in Python.Mathematical Programming Computation, 16:337–367, 2024

2024

[15] [15]

B. Grimmer. On optimal universal first-order methods for minimizing heterogeneous sums.Optimization Letters, 18(2):427–445, 2024.doi:10.1007/s11590-023-02060-2

work page doi:10.1007/s11590-023-02060-2 2024

[16] [16]

Grimmer, K

B. Grimmer, K. Shu, and A. L. Wang. Beyond minimax optimality: a subgame perfect gradient method. Mathematical Programming, 2026. Published online

2026

[17] [17]

Guigues, J

V. Guigues, J. Liang, and R. Monteiro. Universal subgradient and proximal bundle methods for convex and strongly convex hybrid composite optimization, 2025.arXiv:2407.10073

work page arXiv 2025

[18] [18]

J. B. Hiriart-Urruty and C. Lemaréchal.Fundamentals of Convex Analysis. Springer Berlin, Heidelberg, 2001

2001

[19] [19]

Hoheisel

T. Hoheisel. Topics in convex analysis in matrix space. Lecture Notes, Spring School on Variational Analysis, Paseky nad Jizerou, Czech Republic, May 2019. 22

2019

[20] [20]

Kim and J

D. Kim and J. Fessler. Optimized first-order methods for smooth convex minimization.Mathematical Programming, 159:81–107, 2016.doi:10.1007/s10107-015-0949-3

work page doi:10.1007/s10107-015-0949-3 2016

[21] [21]

Li and G

T. Li and G. Lan. A simple uniformly optimal method without line search for convex optimization. Mathematical Programming, 2025. Published online.doi:10.1007/s10107-025-02250-z

work page doi:10.1007/s10107-025-02250-z 2025

[22] [22]

Nonasymptotic analysis of accelerated methods with inexact oracle under absolute error bound, 2025.arXiv:2408.00720

Yin Liu and Sam Davanloo Tajbakhsh. Nonasymptotic analysis of accelerated methods with inexact oracle under absolute error bound, 2025.arXiv:2408.00720

work page arXiv 2025

[23] [23]

Nemirovski and Y

A. Nemirovski and Y. Nesterov. Optimal methods of smooth convex minimization.USSR Comput. Math. Math. Phys., 25:21–30, 1985. URL:ttps://doi.org/10.1016/0041-5553(85)90100-4

work page doi:10.1016/0041-5553(85)90100-4 1985

[24] [24]

Nemirovski and D

A. Nemirovski and D. Yudin.Problem Complexity and Method Efficiency in Optimization. Wiley- Interscience Series in Discrete Mathematics. John Wiley & Sons, New York, 1983. Translated from the Russian by E. R. Dawson

1983

[25] [25]

Nesterov.Introductory Lectures on Convex Optimization: A Basic Course, volume 87 ofApplied Optimization

Y. Nesterov.Introductory Lectures on Convex Optimization: A Basic Course, volume 87 ofApplied Optimization. Kluwer Academic Publishers, Boston, MA, 2004

2004

[26] [26]

Nesterov

Y. Nesterov. Universal gradient methods for convex optimization problems.Mathematical Programming, 152:381–404, 2014.doi:10.1007/s10107-014-0790-0

work page doi:10.1007/s10107-014-0790-0 2014

[27] [27]

Park and E

C. Park and E. Ryu. Optimal first-order algorithms as a function of inequalities.Journal of Machine Learning Research, 25:1–66, 2024. URL:http://jmlr.org/papers/v25/21-1256.html

2024

[28] [28]

Rockafellar.Convex Analysis

R. Rockafellar.Convex Analysis. Princeton University Press, 1996

1996

[29] [29]

Taylor, J

A. Taylor, J. Hendrickx, and F. Glineur. Exact worst-case performance of first-order methods for composite convex optimization.SIAM Journal on Optimization, 27:1283–1313, 2017

2017

[30] [30]

Taylor, J

A. Taylor, J. Hendrickx, and F. Glineur. Smooth strongly convex interpolation and exact worst-case performance of first-order methods.Mathematical Programming, 161:307–345, 2017

2017

[31] [31]

Y. Wu, Y. Ouyang, Z. Zhang, and Q. Luo. Universal and parameter-free gradient sliding for composite optimization, 2026. URL:https://arxiv.org/abs/2603.23492,arXiv:2603.23492

work page internal anchor Pith review Pith/arXiv arXiv 2026

[32] [32]

PhD thesis, Université catholique de Louvain, 2025

Kamri Y.Contributions to Performance Estimation Problems for the Analysis and Design of First-Order Optimization Methods. PhD thesis, Université catholique de Louvain, 2025

2025

[33] [33]

Zamani and F

M. Zamani and F. Glineur. Exact convergence rate of the last iterate in subgradient methods.SIAM Journal on Optimization, 35:2182–2201, 2025. A Deferred Proofs on Inexactly Smooth Calculus Proof of Theorem 2.4.(1.=⇒2.) Here we have that for allx,y with gx∈∂f(x), gy∈∂f(y) and allδ≥0 f(x)≥f(y) +⟨gy,x−y⟩+1 2L(δ)∥gy−gx∥2−δ. Writinggy =gx+(gy−gx)andapplyingYou...

2025

[34] [34]

Noting that gx∈∂f(x)⇐⇒αgx∈∂(αf)(x), applying the change of variablesδ↦→δ/α in (1.4) and rearranging proves the claim

[35] [35]

Therefore, with the change of variablesx↦→Ax−b, (1.4) becomes f(Ay−b)≤f(Ax−b) + ⣨ ATgAx−b,y−x ⟩ + ∥A∥2 opL(δ) 2 ∥y−x∥2 +δ

Note that for anygAx−b∈∂f(Ax−b), by the subgradient chain rule [1, Corollary 16.72], one hasATgAx−b∈∂(f◦(A(·)−b))(x). Therefore, with the change of variablesx↦→Ax−b, (1.4) becomes f(Ay−b)≤f(Ax−b) + ⣨ ATgAx−b,y−x ⟩ + ∥A∥2 opL(δ) 2 ∥y−x∥2 +δ

[36] [36]

Then F(x) = f(x,zx)and F(y) = infzf(y,z )≤ f(y,zx)

For any fixedx, letzx∈argminzf(x,z ). Then F(x) = f(x,zx)and F(y) = infzf(y,z )≤ f(y,zx). Since f(x,z )is L(·)-inexactly smooth it holds for any(x,z ),( y,z′)and( gx,gz)∈ ∂f(x,z)and anyδ≥0that f(y,z′)≤f(x,z) +⟨gx,y−x⟩+ ⟨ gz,z′−z ⟩ + L(δ) 2 (∥x−y∥2 +∥z−z′∥2) +δ. From the subdifferential calculus rule [12, Theorem 1], we know that∂F(x) ={gx : (gx,0)∈ ∂f(x,z...

[37] [37]

Taking the infimum over all selections ofδi≥0gives the claimed formula

For any choice ofδi≥0, summing each inequality in(A.1) with ∑m i=1δi =δand utilizing the sum rule [28, Theorem 23.8] establishes a suitable quadratic upper bound for the function∑m i=1fi. Taking the infimum over all selections ofδi≥0gives the claimed formula

[38] [38]

By [1, Theorem 18.5], lettingI∗(gx) ={i:f∗(gx) =f∗ i (gx)}the subdifferential of this maximum is given by ∂f∗(gx) = conv ⋃ i∈I∗(gx) ∂f∗ i (gx)

Consider the function given by the inner maximumf∗(gx) = maxif∗ i (gx). By [1, Theorem 18.5], lettingI∗(gx) ={i:f∗(gx) =f∗ i (gx)}the subdifferential of this maximum is given by ∂f∗(gx) = conv ⋃ i∈I∗(gx) ∂f∗ i (gx). Fix any gx with x∈∂f∗(gx), which the above formula guarantees must take the form x= ∑ i∈I∗(gx)αixi withx i∈∂f∗ i (gx). Then for anygy, f∗(gy)...

[39] [39]

Taking the supremum over choices ofδi≥0with ∑δi =δtightens this bound, from which the claimed inexact smoothness follows again by Theorem 2.4

Summing the bounds(A.2) with any selection ofδi≥0verifies the ∑1/Li(δi)-inexact strong convexity of∑f∗ i . Taking the supremum over choices ofδi≥0with ∑δi =δtightens this bound, from which the claimed inexact smoothness follows again by Theorem 2.4. Proof of Theorem 2.7.For anyx,y∈Rd, gx∈∂f(x), andδ≥0, it suffices to show that the followingS f function is...