Inexactly Smooth Performance Estimation and New Optimized Gradient Methods
Pith reviewed 2026-06-28 13:56 UTC · model grok-4.3
The pith
Inexactly smooth convex functions have nearly tight interpolation theorems that support optimal first-order method construction.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Inexactly smooth convex functions possess interpolation theorems that are necessary and sufficient up to modest universal constants. These enable analysis of first-order methods for any inexactly smooth convex problem class via solving convex Performance Estimation Problems (PEPs) and the extension of Drori and Taylor's constructive approach to algorithm design, from which an exactly minimax optimal method for (β,0)-Hölder smooth problems, methods with the best-known convergence guarantees up to constants for any (β,p)-Hölder smooth convex minimization, and a new universal fast backtracking method are derived.
What carries the argument
Interpolation theorems for the class of inexactly smooth convex functions
If this is right
- Any first-order method on inexactly smooth problems can be analyzed by solving a convex PEP.
- An exactly minimax optimal first-order method is obtained for (β,0)-Hölder smooth convex minimization.
- Convergence guarantees that are best-known up to constants are achieved for general (β,p)-Hölder smooth problems.
- A universal fast backtracking method is constructed that applies to all inexactly smooth convex problems.
Where Pith is reading between the lines
- The backtracking method may reduce the need for manual tuning of step sizes in practical convex optimization.
- Similar interpolation techniques could be developed for other function classes such as weakly convex or nonsmooth problems.
- The framework might allow comparison of method performance across different smoothness assumptions in a unified way.
Load-bearing premise
The interpolation theorems hold with only modest universal constants for all combinations of the special cases of inexactly smooth functions.
What would settle it
Finding an inexactly smooth function and a set of points where the interpolation bound is violated by a factor larger than the universal constant, or observing that the proposed optimal method does not achieve the predicted minimax rate on a concrete (β,0)-Hölder smooth instance.
Figures
read the original abstract
We consider a general class of ``inexactly smooth'' convex functions, providing a universal model capturing as special cases $L$-smooth, $M$-Lipschitz, and H\"older smooth functions, and any combination thereof. Such functions possess a calculus closely following that of smooth functions. Our main results provide inexactly smooth functions with interpolation theorems that are necessary and sufficient up to modest universal constants. These enable analysis of first-order methods for any inexactly smooth convex problem class via solving convex Performance Estimation Problems (PEPs). Further, these enable the extension of Drori and Taylor's constructive approach to algorithm design. From this, we derive an exactly minimax optimal method for $(\beta,0)$-H\"older smooth problems, methods with the best-known convergence guarantees up to constants for any $(\beta,p)$-H\"older smooth convex minimization, and a new universal fast backtracking method for any inexactly smooth convex problem.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces a general class of inexactly smooth convex functions that captures L-smooth, M-Lipschitz, and Hölder smooth functions (and combinations thereof) as special cases. It establishes interpolation theorems for this class that are necessary and sufficient up to modest universal constants. These theorems enable analysis of first-order methods on any such problem class via convex Performance Estimation Problems (PEPs) and extend Drori-Taylor's constructive algorithm design approach. From this framework the paper derives an exactly minimax optimal method for (β,0)-Hölder smooth problems, methods with the best-known guarantees (up to constants) for general (β,p)-Hölder smooth convex minimization, and a new universal fast backtracking method.
Significance. If the interpolation results hold with the stated constants, the work meaningfully extends the PEP methodology beyond the smooth case, providing a unified analysis tool for a broader family of convex problems and enabling new optimized first-order methods. The constructive derivation of an exactly minimax optimal method for the (β,0) case would be a notable advance if the constant-factor looseness is resolved without compromising exactness.
major comments (2)
- [Abstract] Abstract: the interpolation theorems are claimed to be necessary and sufficient only 'up to modest universal constants,' yet the paper asserts derivation of an 'exactly minimax optimal method' for (β,0)-Hölder smooth problems. If any universal constant strictly exceeds 1, the PEP upper bounds inherit a multiplicative looseness; it is unclear how exact matching to minimax lower bounds is obtained without an explicit argument that the constants equal 1 in this case or that the looseness cancels exactly in the lower-bound comparison.
- [Abstract] Abstract: the assertion that inexactly smooth functions 'possess a calculus closely following that of smooth functions' is load-bearing for the PEP extension to arbitrary combinations of L-smooth, M-Lipschitz, and Hölder smooth cases. The manuscript should supply a concrete verification (or counter-example) that the interpolation theorems remain valid under all such combinations, as this underpins the universality claim.
minor comments (1)
- [Abstract] Abstract: the parameterization of Hölder smoothness as (β,p) is introduced without a brief parenthetical definition, which may hinder immediate readability for readers outside the subfield.
Simulated Author's Rebuttal
We thank the referee for the careful reading and constructive feedback. We address each major comment below.
read point-by-point responses
-
Referee: [Abstract] Abstract: the interpolation theorems are claimed to be necessary and sufficient only 'up to modest universal constants,' yet the paper asserts derivation of an 'exactly minimax optimal method' for (β,0)-Hölder smooth problems. If any universal constant strictly exceeds 1, the PEP upper bounds inherit a multiplicative looseness; it is unclear how exact matching to minimax lower bounds is obtained without an explicit argument that the constants equal 1 in this case or that the looseness cancels exactly in the lower-bound comparison.
Authors: For the specific case of (β,0)-Hölder smooth functions the interpolation theorem holds with constants exactly equal to 1 (see the statement and proof of Theorem 3.2). This is why the PEP yields an exactly minimax optimal method with no inherited looseness. We will revise the abstract and add a short clarifying sentence in Section 3 to make this distinction explicit. revision: yes
-
Referee: [Abstract] Abstract: the assertion that inexactly smooth functions 'possess a calculus closely following that of smooth functions' is load-bearing for the PEP extension to arbitrary combinations of L-smooth, M-Lipschitz, and Hölder smooth cases. The manuscript should supply a concrete verification (or counter-example) that the interpolation theorems remain valid under all such combinations, as this underpins the universality claim.
Authors: The interpolation theorems are stated for the general inexactly smooth class, which by definition includes arbitrary combinations of the listed properties. Nevertheless, we agree that an explicit verification for combined cases would strengthen the manuscript. We will add a short subsection (or appendix paragraph) providing a concrete verification for the joint (L,M,β,p) case. revision: yes
Circularity Check
No significant circularity; derivation builds on independent interpolation theorems
full rationale
The paper establishes interpolation theorems for the inexactly smooth class that are necessary and sufficient up to modest universal constants; these theorems are presented as the main results and serve as the foundation for setting up convex PEPs and extending the Drori-Taylor design approach. No quoted steps reduce a claimed prediction or optimality result to a fitted parameter, self-definition, or load-bearing self-citation chain. The extension to an exactly minimax optimal method for the (β,0) case is derived from the PEP solution rather than being presupposed by the input class definition. The approach is self-contained against external benchmarks once the interpolation theorems are accepted, with no evidence of the enumerated circular patterns.
Axiom & Free-Parameter Ledger
Reference graph
Works this paper leans on
-
[1]
Bauschke and P
H. Bauschke and P. Combettes.Convex Analysis and Monotone Operator Theory in Hilbert Spaces. Springer Cham, 2017
2017
-
[2]
Boyd and L
S. Boyd and L. Vandenberghe.Convex Optimization. Cambridge University Press, Cambridge, UK, 2004
2004
-
[3]
S. Bubeck. Convex optimization: Algorithms and complexity.Foundations and Trends in Machine Learning, 8:231–357, 2015.doi:10.1561/2200000050
-
[4]
de Klerk, F
E. de Klerk, F. Glineur, and A. B. Taylor. On the worst-case complexity of the gradient method with exact line search for smooth strongly convex functions.Optimization Letters, 11(7):1185–1199, 2017
2017
-
[5]
de Klerk, F
E. de Klerk, F. Glineur, and A. B. Taylor. Worst-case convergence analysis of inexact gradient and Newton methods through semidefinite programming performance estimation.SIAM Journal on Optimization, 30(3):2053–2082, 2020
2053
-
[6]
Devolder, F
O. Devolder, F. Glineur, and Y. Nesterov. First-order methods of smooth convex optimization with inexact oracle.Mathematical Programming, 146:37–75, 2014
2014
-
[7]
J. Diakonikolas and C. Guzmán. Optimization on a finer scale: Bounded local subgradient variation perspective.SIAM Journal on Optimization, 36:152–184, 2026.doi:10.1137/24M1659510
-
[8]
Y. Drori. The exact information-based complexity of smooth convex minimization.Journal of Complexity, 39:1–16, 2017.doi:10.1016/j.jco.2016.11.001
-
[9]
Drori and A
Y. Drori and A. Taylor. Efficient first-order methods for convex minimization: a constructive approach. Mathematical Programming, 184:183–220, 2020
2020
-
[10]
Y. Drori and M. Teboulle. Performance of first-order methods for smooth convex minimization: a novel approach.Mathematical Programming, 145:451–482, 2014.doi:10.1007/s10107-013-0653-0
-
[11]
Drori and M
Y. Drori and M. Teboulle. An optimal variant of kelley’s cutting-plane method.Mathematical Program- ming, 160:321–351, 2016
2016
-
[12]
M. P. Friedlander, Goodwin A, and T. Hoheisel. From perspective maps to epigraphical projections. Mathematics of Operations Research, 48:1711–1740, 2022
2022
-
[13]
O. Gannot. A frequency-domain analysis of inexact gradient methods.Mathematical Programming, 194:975–1016, 2022
2022
-
[14]
Goujaud, C
B. Goujaud, C. Moucer, F. Glineur, J. Hendrickx, A. Taylor, and A. Dieuleveut. PEPit: computer- assisted worst-case analyses of first-order optimization methods in Python.Mathematical Programming Computation, 16:337–367, 2024
2024
-
[15]
B. Grimmer. On optimal universal first-order methods for minimizing heterogeneous sums.Optimization Letters, 18(2):427–445, 2024.doi:10.1007/s11590-023-02060-2
-
[16]
Grimmer, K
B. Grimmer, K. Shu, and A. L. Wang. Beyond minimax optimality: a subgame perfect gradient method. Mathematical Programming, 2026. Published online
2026
-
[17]
V. Guigues, J. Liang, and R. Monteiro. Universal subgradient and proximal bundle methods for convex and strongly convex hybrid composite optimization, 2025.arXiv:2407.10073
-
[18]
J. B. Hiriart-Urruty and C. Lemaréchal.Fundamentals of Convex Analysis. Springer Berlin, Heidelberg, 2001
2001
-
[19]
Hoheisel
T. Hoheisel. Topics in convex analysis in matrix space. Lecture Notes, Spring School on Variational Analysis, Paseky nad Jizerou, Czech Republic, May 2019. 22
2019
-
[20]
D. Kim and J. Fessler. Optimized first-order methods for smooth convex minimization.Mathematical Programming, 159:81–107, 2016.doi:10.1007/s10107-015-0949-3
-
[21]
T. Li and G. Lan. A simple uniformly optimal method without line search for convex optimization. Mathematical Programming, 2025. Published online.doi:10.1007/s10107-025-02250-z
-
[22]
Yin Liu and Sam Davanloo Tajbakhsh. Nonasymptotic analysis of accelerated methods with inexact oracle under absolute error bound, 2025.arXiv:2408.00720
-
[23]
A. Nemirovski and Y. Nesterov. Optimal methods of smooth convex minimization.USSR Comput. Math. Math. Phys., 25:21–30, 1985. URL:ttps://doi.org/10.1016/0041-5553(85)90100-4
-
[24]
Nemirovski and D
A. Nemirovski and D. Yudin.Problem Complexity and Method Efficiency in Optimization. Wiley- Interscience Series in Discrete Mathematics. John Wiley & Sons, New York, 1983. Translated from the Russian by E. R. Dawson
1983
-
[25]
Nesterov.Introductory Lectures on Convex Optimization: A Basic Course, volume 87 ofApplied Optimization
Y. Nesterov.Introductory Lectures on Convex Optimization: A Basic Course, volume 87 ofApplied Optimization. Kluwer Academic Publishers, Boston, MA, 2004
2004
-
[26]
Y. Nesterov. Universal gradient methods for convex optimization problems.Mathematical Programming, 152:381–404, 2014.doi:10.1007/s10107-014-0790-0
-
[27]
Park and E
C. Park and E. Ryu. Optimal first-order algorithms as a function of inequalities.Journal of Machine Learning Research, 25:1–66, 2024. URL:http://jmlr.org/papers/v25/21-1256.html
2024
-
[28]
Rockafellar.Convex Analysis
R. Rockafellar.Convex Analysis. Princeton University Press, 1996
1996
-
[29]
Taylor, J
A. Taylor, J. Hendrickx, and F. Glineur. Exact worst-case performance of first-order methods for composite convex optimization.SIAM Journal on Optimization, 27:1283–1313, 2017
2017
-
[30]
Taylor, J
A. Taylor, J. Hendrickx, and F. Glineur. Smooth strongly convex interpolation and exact worst-case performance of first-order methods.Mathematical Programming, 161:307–345, 2017
2017
-
[31]
Y. Wu, Y. Ouyang, Z. Zhang, and Q. Luo. Universal and parameter-free gradient sliding for composite optimization, 2026. URL:https://arxiv.org/abs/2603.23492,arXiv:2603.23492
work page internal anchor Pith review Pith/arXiv arXiv 2026
-
[32]
PhD thesis, Université catholique de Louvain, 2025
Kamri Y.Contributions to Performance Estimation Problems for the Analysis and Design of First-Order Optimization Methods. PhD thesis, Université catholique de Louvain, 2025
2025
-
[33]
Zamani and F
M. Zamani and F. Glineur. Exact convergence rate of the last iterate in subgradient methods.SIAM Journal on Optimization, 35:2182–2201, 2025. A Deferred Proofs on Inexactly Smooth Calculus Proof of Theorem 2.4.(1.=⇒2.) Here we have that for allx,y with gx∈∂f(x), gy∈∂f(y) and allδ≥0 f(x)≥f(y) +⟨gy,x−y⟩+1 2L(δ)∥gy−gx∥2−δ. Writinggy =gx+(gy−gx)andapplyingYou...
2025
-
[34]
Noting that gx∈∂f(x)⇐⇒αgx∈∂(αf)(x), applying the change of variablesδ↦→δ/α in (1.4) and rearranging proves the claim
-
[35]
Therefore, with the change of variablesx↦→Ax−b, (1.4) becomes f(Ay−b)≤f(Ax−b) + ⣨ ATgAx−b,y−x ⟩ + ∥A∥2 opL(δ) 2 ∥y−x∥2 +δ
Note that for anygAx−b∈∂f(Ax−b), by the subgradient chain rule [1, Corollary 16.72], one hasATgAx−b∈∂(f◦(A(·)−b))(x). Therefore, with the change of variablesx↦→Ax−b, (1.4) becomes f(Ay−b)≤f(Ax−b) + ⣨ ATgAx−b,y−x ⟩ + ∥A∥2 opL(δ) 2 ∥y−x∥2 +δ
-
[36]
Then F(x) = f(x,zx)and F(y) = infzf(y,z )≤ f(y,zx)
For any fixedx, letzx∈argminzf(x,z ). Then F(x) = f(x,zx)and F(y) = infzf(y,z )≤ f(y,zx). Since f(x,z )is L(·)-inexactly smooth it holds for any(x,z ),( y,z′)and( gx,gz)∈ ∂f(x,z)and anyδ≥0that f(y,z′)≤f(x,z) +⟨gx,y−x⟩+ ⟨ gz,z′−z ⟩ + L(δ) 2 (∥x−y∥2 +∥z−z′∥2) +δ. From the subdifferential calculus rule [12, Theorem 1], we know that∂F(x) ={gx : (gx,0)∈ ∂f(x,z...
-
[37]
Taking the infimum over all selections ofδi≥0gives the claimed formula
For any choice ofδi≥0, summing each inequality in(A.1) with ∑m i=1δi =δand utilizing the sum rule [28, Theorem 23.8] establishes a suitable quadratic upper bound for the function∑m i=1fi. Taking the infimum over all selections ofδi≥0gives the claimed formula
-
[38]
By [1, Theorem 18.5], lettingI∗(gx) ={i:f∗(gx) =f∗ i (gx)}the subdifferential of this maximum is given by ∂f∗(gx) = conv ⋃ i∈I∗(gx) ∂f∗ i (gx)
Consider the function given by the inner maximumf∗(gx) = maxif∗ i (gx). By [1, Theorem 18.5], lettingI∗(gx) ={i:f∗(gx) =f∗ i (gx)}the subdifferential of this maximum is given by ∂f∗(gx) = conv ⋃ i∈I∗(gx) ∂f∗ i (gx). Fix any gx with x∈∂f∗(gx), which the above formula guarantees must take the form x= ∑ i∈I∗(gx)αixi withx i∈∂f∗ i (gx). Then for anygy, f∗(gy)...
-
[39]
Taking the supremum over choices ofδi≥0with ∑δi =δtightens this bound, from which the claimed inexact smoothness follows again by Theorem 2.4
Summing the bounds(A.2) with any selection ofδi≥0verifies the ∑1/Li(δi)-inexact strong convexity of∑f∗ i . Taking the supremum over choices ofδi≥0with ∑δi =δtightens this bound, from which the claimed inexact smoothness follows again by Theorem 2.4. Proof of Theorem 2.7.For anyx,y∈Rd, gx∈∂f(x), andδ≥0, it suffices to show that the followingS f function is...
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.