pith. sign in

arxiv: 2606.01505 · v2 · pith:EK5XPFHFnew · submitted 2026-06-01 · 🧮 math.OC

Inexactly Smooth Performance Estimation and New Optimized Gradient Methods

Pith reviewed 2026-06-28 13:56 UTC · model grok-4.3

classification 🧮 math.OC
keywords inexactly smoothperformance estimationHölder smoothconvex optimizationfirst-order methodsinterpolation theoremsgradient methodsbacktracking
0
0 comments X

The pith

Inexactly smooth convex functions have nearly tight interpolation theorems that support optimal first-order method construction.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper defines a broad class of inexactly smooth convex functions that generalizes several standard smoothness notions including L-smooth, M-Lipschitz, and Hölder smooth cases. It shows these functions satisfy interpolation theorems that are necessary and sufficient up to small universal factors. This allows any first-order method to be analyzed exactly by solving a convex performance estimation problem. The same property supports constructing new methods that achieve optimal or best-known rates for Hölder smooth problems and a backtracking method that works without knowing the smoothness parameters in advance.

Core claim

Inexactly smooth convex functions possess interpolation theorems that are necessary and sufficient up to modest universal constants. These enable analysis of first-order methods for any inexactly smooth convex problem class via solving convex Performance Estimation Problems (PEPs) and the extension of Drori and Taylor's constructive approach to algorithm design, from which an exactly minimax optimal method for (β,0)-Hölder smooth problems, methods with the best-known convergence guarantees up to constants for any (β,p)-Hölder smooth convex minimization, and a new universal fast backtracking method are derived.

What carries the argument

Interpolation theorems for the class of inexactly smooth convex functions

If this is right

  • Any first-order method on inexactly smooth problems can be analyzed by solving a convex PEP.
  • An exactly minimax optimal first-order method is obtained for (β,0)-Hölder smooth convex minimization.
  • Convergence guarantees that are best-known up to constants are achieved for general (β,p)-Hölder smooth problems.
  • A universal fast backtracking method is constructed that applies to all inexactly smooth convex problems.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The backtracking method may reduce the need for manual tuning of step sizes in practical convex optimization.
  • Similar interpolation techniques could be developed for other function classes such as weakly convex or nonsmooth problems.
  • The framework might allow comparison of method performance across different smoothness assumptions in a unified way.

Load-bearing premise

The interpolation theorems hold with only modest universal constants for all combinations of the special cases of inexactly smooth functions.

What would settle it

Finding an inexactly smooth function and a set of points where the interpolation bound is violated by a factor larger than the universal constant, or observing that the proposed optimal method does not achieve the predicted minimax rate on a concrete (β,0)-Hölder smooth instance.

Figures

Figures reproduced from arXiv: 2606.01505 by Aaron Zoll, Benjamin Grimmer.

Figure 1
Figure 1. Figure 1: Suboptimality of the subgradient method [3, Section 3.1] on [PITH_FULL_IMAGE:figures/full_fig_p018_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: The first plot compares the numerical output of the constructive approach [PITH_FULL_IMAGE:figures/full_fig_p019_2.png] view at source ↗
read the original abstract

We consider a general class of ``inexactly smooth'' convex functions, providing a universal model capturing as special cases $L$-smooth, $M$-Lipschitz, and H\"older smooth functions, and any combination thereof. Such functions possess a calculus closely following that of smooth functions. Our main results provide inexactly smooth functions with interpolation theorems that are necessary and sufficient up to modest universal constants. These enable analysis of first-order methods for any inexactly smooth convex problem class via solving convex Performance Estimation Problems (PEPs). Further, these enable the extension of Drori and Taylor's constructive approach to algorithm design. From this, we derive an exactly minimax optimal method for $(\beta,0)$-H\"older smooth problems, methods with the best-known convergence guarantees up to constants for any $(\beta,p)$-H\"older smooth convex minimization, and a new universal fast backtracking method for any inexactly smooth convex problem.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper introduces a general class of inexactly smooth convex functions that captures L-smooth, M-Lipschitz, and Hölder smooth functions (and combinations thereof) as special cases. It establishes interpolation theorems for this class that are necessary and sufficient up to modest universal constants. These theorems enable analysis of first-order methods on any such problem class via convex Performance Estimation Problems (PEPs) and extend Drori-Taylor's constructive algorithm design approach. From this framework the paper derives an exactly minimax optimal method for (β,0)-Hölder smooth problems, methods with the best-known guarantees (up to constants) for general (β,p)-Hölder smooth convex minimization, and a new universal fast backtracking method.

Significance. If the interpolation results hold with the stated constants, the work meaningfully extends the PEP methodology beyond the smooth case, providing a unified analysis tool for a broader family of convex problems and enabling new optimized first-order methods. The constructive derivation of an exactly minimax optimal method for the (β,0) case would be a notable advance if the constant-factor looseness is resolved without compromising exactness.

major comments (2)
  1. [Abstract] Abstract: the interpolation theorems are claimed to be necessary and sufficient only 'up to modest universal constants,' yet the paper asserts derivation of an 'exactly minimax optimal method' for (β,0)-Hölder smooth problems. If any universal constant strictly exceeds 1, the PEP upper bounds inherit a multiplicative looseness; it is unclear how exact matching to minimax lower bounds is obtained without an explicit argument that the constants equal 1 in this case or that the looseness cancels exactly in the lower-bound comparison.
  2. [Abstract] Abstract: the assertion that inexactly smooth functions 'possess a calculus closely following that of smooth functions' is load-bearing for the PEP extension to arbitrary combinations of L-smooth, M-Lipschitz, and Hölder smooth cases. The manuscript should supply a concrete verification (or counter-example) that the interpolation theorems remain valid under all such combinations, as this underpins the universality claim.
minor comments (1)
  1. [Abstract] Abstract: the parameterization of Hölder smoothness as (β,p) is introduced without a brief parenthetical definition, which may hinder immediate readability for readers outside the subfield.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful reading and constructive feedback. We address each major comment below.

read point-by-point responses
  1. Referee: [Abstract] Abstract: the interpolation theorems are claimed to be necessary and sufficient only 'up to modest universal constants,' yet the paper asserts derivation of an 'exactly minimax optimal method' for (β,0)-Hölder smooth problems. If any universal constant strictly exceeds 1, the PEP upper bounds inherit a multiplicative looseness; it is unclear how exact matching to minimax lower bounds is obtained without an explicit argument that the constants equal 1 in this case or that the looseness cancels exactly in the lower-bound comparison.

    Authors: For the specific case of (β,0)-Hölder smooth functions the interpolation theorem holds with constants exactly equal to 1 (see the statement and proof of Theorem 3.2). This is why the PEP yields an exactly minimax optimal method with no inherited looseness. We will revise the abstract and add a short clarifying sentence in Section 3 to make this distinction explicit. revision: yes

  2. Referee: [Abstract] Abstract: the assertion that inexactly smooth functions 'possess a calculus closely following that of smooth functions' is load-bearing for the PEP extension to arbitrary combinations of L-smooth, M-Lipschitz, and Hölder smooth cases. The manuscript should supply a concrete verification (or counter-example) that the interpolation theorems remain valid under all such combinations, as this underpins the universality claim.

    Authors: The interpolation theorems are stated for the general inexactly smooth class, which by definition includes arbitrary combinations of the listed properties. Nevertheless, we agree that an explicit verification for combined cases would strengthen the manuscript. We will add a short subsection (or appendix paragraph) providing a concrete verification for the joint (L,M,β,p) case. revision: yes

Circularity Check

0 steps flagged

No significant circularity; derivation builds on independent interpolation theorems

full rationale

The paper establishes interpolation theorems for the inexactly smooth class that are necessary and sufficient up to modest universal constants; these theorems are presented as the main results and serve as the foundation for setting up convex PEPs and extending the Drori-Taylor design approach. No quoted steps reduce a claimed prediction or optimality result to a fitted parameter, self-definition, or load-bearing self-citation chain. The extension to an exactly minimax optimal method for the (β,0) case is derived from the PEP solution rather than being presupposed by the input class definition. The approach is self-contained against external benchmarks once the interpolation theorems are accepted, with no evidence of the enumerated circular patterns.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Based on abstract only; no specific free parameters, axioms, or invented entities identifiable without full text.

pith-pipeline@v0.9.1-grok · 5682 in / 977 out tokens · 21410 ms · 2026-06-28T13:56:44.152148+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

39 extracted references · 12 canonical work pages · 1 internal anchor

  1. [1]

    Bauschke and P

    H. Bauschke and P. Combettes.Convex Analysis and Monotone Operator Theory in Hilbert Spaces. Springer Cham, 2017

  2. [2]

    Boyd and L

    S. Boyd and L. Vandenberghe.Convex Optimization. Cambridge University Press, Cambridge, UK, 2004

  3. [3]

    S. Bubeck. Convex optimization: Algorithms and complexity.Foundations and Trends in Machine Learning, 8:231–357, 2015.doi:10.1561/2200000050

  4. [4]

    de Klerk, F

    E. de Klerk, F. Glineur, and A. B. Taylor. On the worst-case complexity of the gradient method with exact line search for smooth strongly convex functions.Optimization Letters, 11(7):1185–1199, 2017

  5. [5]

    de Klerk, F

    E. de Klerk, F. Glineur, and A. B. Taylor. Worst-case convergence analysis of inexact gradient and Newton methods through semidefinite programming performance estimation.SIAM Journal on Optimization, 30(3):2053–2082, 2020

  6. [6]

    Devolder, F

    O. Devolder, F. Glineur, and Y. Nesterov. First-order methods of smooth convex optimization with inexact oracle.Mathematical Programming, 146:37–75, 2014

  7. [7]

    Diakonikolas and C

    J. Diakonikolas and C. Guzmán. Optimization on a finer scale: Bounded local subgradient variation perspective.SIAM Journal on Optimization, 36:152–184, 2026.doi:10.1137/24M1659510

  8. [8]

    Y. Drori. The exact information-based complexity of smooth convex minimization.Journal of Complexity, 39:1–16, 2017.doi:10.1016/j.jco.2016.11.001

  9. [9]

    Drori and A

    Y. Drori and A. Taylor. Efficient first-order methods for convex minimization: a constructive approach. Mathematical Programming, 184:183–220, 2020

  10. [10]

    Drori and M

    Y. Drori and M. Teboulle. Performance of first-order methods for smooth convex minimization: a novel approach.Mathematical Programming, 145:451–482, 2014.doi:10.1007/s10107-013-0653-0

  11. [11]

    Drori and M

    Y. Drori and M. Teboulle. An optimal variant of kelley’s cutting-plane method.Mathematical Program- ming, 160:321–351, 2016

  12. [12]

    M. P. Friedlander, Goodwin A, and T. Hoheisel. From perspective maps to epigraphical projections. Mathematics of Operations Research, 48:1711–1740, 2022

  13. [13]

    O. Gannot. A frequency-domain analysis of inexact gradient methods.Mathematical Programming, 194:975–1016, 2022

  14. [14]

    Goujaud, C

    B. Goujaud, C. Moucer, F. Glineur, J. Hendrickx, A. Taylor, and A. Dieuleveut. PEPit: computer- assisted worst-case analyses of first-order optimization methods in Python.Mathematical Programming Computation, 16:337–367, 2024

  15. [15]

    B. Grimmer. On optimal universal first-order methods for minimizing heterogeneous sums.Optimization Letters, 18(2):427–445, 2024.doi:10.1007/s11590-023-02060-2

  16. [16]

    Grimmer, K

    B. Grimmer, K. Shu, and A. L. Wang. Beyond minimax optimality: a subgame perfect gradient method. Mathematical Programming, 2026. Published online

  17. [17]

    Guigues, J

    V. Guigues, J. Liang, and R. Monteiro. Universal subgradient and proximal bundle methods for convex and strongly convex hybrid composite optimization, 2025.arXiv:2407.10073

  18. [18]

    J. B. Hiriart-Urruty and C. Lemaréchal.Fundamentals of Convex Analysis. Springer Berlin, Heidelberg, 2001

  19. [19]

    Hoheisel

    T. Hoheisel. Topics in convex analysis in matrix space. Lecture Notes, Spring School on Variational Analysis, Paseky nad Jizerou, Czech Republic, May 2019. 22

  20. [20]

    Kim and J

    D. Kim and J. Fessler. Optimized first-order methods for smooth convex minimization.Mathematical Programming, 159:81–107, 2016.doi:10.1007/s10107-015-0949-3

  21. [21]

    Li and G

    T. Li and G. Lan. A simple uniformly optimal method without line search for convex optimization. Mathematical Programming, 2025. Published online.doi:10.1007/s10107-025-02250-z

  22. [22]

    Nonasymptotic analysis of accelerated methods with inexact oracle under absolute error bound, 2025.arXiv:2408.00720

    Yin Liu and Sam Davanloo Tajbakhsh. Nonasymptotic analysis of accelerated methods with inexact oracle under absolute error bound, 2025.arXiv:2408.00720

  23. [23]

    Nemirovski and Y

    A. Nemirovski and Y. Nesterov. Optimal methods of smooth convex minimization.USSR Comput. Math. Math. Phys., 25:21–30, 1985. URL:ttps://doi.org/10.1016/0041-5553(85)90100-4

  24. [24]

    Nemirovski and D

    A. Nemirovski and D. Yudin.Problem Complexity and Method Efficiency in Optimization. Wiley- Interscience Series in Discrete Mathematics. John Wiley & Sons, New York, 1983. Translated from the Russian by E. R. Dawson

  25. [25]

    Nesterov.Introductory Lectures on Convex Optimization: A Basic Course, volume 87 ofApplied Optimization

    Y. Nesterov.Introductory Lectures on Convex Optimization: A Basic Course, volume 87 ofApplied Optimization. Kluwer Academic Publishers, Boston, MA, 2004

  26. [26]

    Nesterov

    Y. Nesterov. Universal gradient methods for convex optimization problems.Mathematical Programming, 152:381–404, 2014.doi:10.1007/s10107-014-0790-0

  27. [27]

    Park and E

    C. Park and E. Ryu. Optimal first-order algorithms as a function of inequalities.Journal of Machine Learning Research, 25:1–66, 2024. URL:http://jmlr.org/papers/v25/21-1256.html

  28. [28]

    Rockafellar.Convex Analysis

    R. Rockafellar.Convex Analysis. Princeton University Press, 1996

  29. [29]

    Taylor, J

    A. Taylor, J. Hendrickx, and F. Glineur. Exact worst-case performance of first-order methods for composite convex optimization.SIAM Journal on Optimization, 27:1283–1313, 2017

  30. [30]

    Taylor, J

    A. Taylor, J. Hendrickx, and F. Glineur. Smooth strongly convex interpolation and exact worst-case performance of first-order methods.Mathematical Programming, 161:307–345, 2017

  31. [31]

    Y. Wu, Y. Ouyang, Z. Zhang, and Q. Luo. Universal and parameter-free gradient sliding for composite optimization, 2026. URL:https://arxiv.org/abs/2603.23492,arXiv:2603.23492

  32. [32]

    PhD thesis, Université catholique de Louvain, 2025

    Kamri Y.Contributions to Performance Estimation Problems for the Analysis and Design of First-Order Optimization Methods. PhD thesis, Université catholique de Louvain, 2025

  33. [33]

    Zamani and F

    M. Zamani and F. Glineur. Exact convergence rate of the last iterate in subgradient methods.SIAM Journal on Optimization, 35:2182–2201, 2025. A Deferred Proofs on Inexactly Smooth Calculus Proof of Theorem 2.4.(1.=⇒2.) Here we have that for allx,y with gx∈∂f(x), gy∈∂f(y) and allδ≥0 f(x)≥f(y) +⟨gy,x−y⟩+1 2L(δ)∥gy−gx∥2−δ. Writinggy =gx+(gy−gx)andapplyingYou...

  34. [34]

    Noting that gx∈∂f(x)⇐⇒αgx∈∂(αf)(x), applying the change of variablesδ↦→δ/α in (1.4) and rearranging proves the claim

  35. [35]

    Therefore, with the change of variablesx↦→Ax−b, (1.4) becomes f(Ay−b)≤f(Ax−b) + ⣨ ATgAx−b,y−x ⟩ + ∥A∥2 opL(δ) 2 ∥y−x∥2 +δ

    Note that for anygAx−b∈∂f(Ax−b), by the subgradient chain rule [1, Corollary 16.72], one hasATgAx−b∈∂(f◦(A(·)−b))(x). Therefore, with the change of variablesx↦→Ax−b, (1.4) becomes f(Ay−b)≤f(Ax−b) + ⣨ ATgAx−b,y−x ⟩ + ∥A∥2 opL(δ) 2 ∥y−x∥2 +δ

  36. [36]

    Then F(x) = f(x,zx)and F(y) = infzf(y,z )≤ f(y,zx)

    For any fixedx, letzx∈argminzf(x,z ). Then F(x) = f(x,zx)and F(y) = infzf(y,z )≤ f(y,zx). Since f(x,z )is L(·)-inexactly smooth it holds for any(x,z ),( y,z′)and( gx,gz)∈ ∂f(x,z)and anyδ≥0that f(y,z′)≤f(x,z) +⟨gx,y−x⟩+ ⟨ gz,z′−z ⟩ + L(δ) 2 (∥x−y∥2 +∥z−z′∥2) +δ. From the subdifferential calculus rule [12, Theorem 1], we know that∂F(x) ={gx : (gx,0)∈ ∂f(x,z...

  37. [37]

    Taking the infimum over all selections ofδi≥0gives the claimed formula

    For any choice ofδi≥0, summing each inequality in(A.1) with ∑m i=1δi =δand utilizing the sum rule [28, Theorem 23.8] establishes a suitable quadratic upper bound for the function∑m i=1fi. Taking the infimum over all selections ofδi≥0gives the claimed formula

  38. [38]

    By [1, Theorem 18.5], lettingI∗(gx) ={i:f∗(gx) =f∗ i (gx)}the subdifferential of this maximum is given by ∂f∗(gx) = conv ⋃ i∈I∗(gx) ∂f∗ i (gx)

    Consider the function given by the inner maximumf∗(gx) = maxif∗ i (gx). By [1, Theorem 18.5], lettingI∗(gx) ={i:f∗(gx) =f∗ i (gx)}the subdifferential of this maximum is given by ∂f∗(gx) = conv ⋃ i∈I∗(gx) ∂f∗ i (gx). Fix any gx with x∈∂f∗(gx), which the above formula guarantees must take the form x= ∑ i∈I∗(gx)αixi withx i∈∂f∗ i (gx). Then for anygy, f∗(gy)...

  39. [39]

    Taking the supremum over choices ofδi≥0with ∑δi =δtightens this bound, from which the claimed inexact smoothness follows again by Theorem 2.4

    Summing the bounds(A.2) with any selection ofδi≥0verifies the ∑1/Li(δi)-inexact strong convexity of∑f∗ i . Taking the supremum over choices ofδi≥0with ∑δi =δtightens this bound, from which the claimed inexact smoothness follows again by Theorem 2.4. Proof of Theorem 2.7.For anyx,y∈Rd, gx∈∂f(x), andδ≥0, it suffices to show that the followingS f function is...