pith. sign in

arxiv: 2605.12820 · v1 · pith:JNJRFGSWnew · submitted 2026-05-12 · 🧮 math.OC

The Method of Ellipcenters for strongly convex minimization

Pith reviewed 2026-05-14 19:32 UTC · model grok-4.3

classification 🧮 math.OC
keywords ellipcentersstrongly convexlinear convergencefirst-order methodsaccelerated optimizationunconstrained minimizationill-conditioned problems
0
0 comments X

The pith

The Method of Ellipcenters converges linearly for any differentiable strongly convex objective.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that the Method of Ellipcenters, an accelerated first-order scheme, achieves linear convergence not only for quadratic strongly convex problems but for any differentiable strongly convex function. Each iterate is the center of an ellipse designed to capture the ill-conditioning of the objective. This geometric construction allows the method to maintain its convergence rate across a broader class of functions. Experiments indicate that this approach performs well numerically compared to methods like FISTA and conjugate gradient. The results suggest the method is promising for even more general optimization settings.

Core claim

We derive a linear convergence rate for the Method of Ellipcenters applied to any differentiable strongly convex objective, where the iterates are chosen as centers of ellipses that are carefully designed to capture the ill-conditioning of the underlying optimization problem.

What carries the argument

Ellipcenters: the centers of ellipses constructed at each step to approximate and capture the local conditioning of the strongly convex function.

Load-bearing premise

Ellipses can be constructed at each iteration that capture the ill-conditioning of any differentiable strongly convex function without losing the linear convergence rate.

What would settle it

A counterexample of a differentiable strongly convex function for which the method fails to exhibit linear convergence would falsify the result.

read the original abstract

This work is about ME, the Method of Ellipcenters. ME was recently introduced by these very authors as a first order accelerated scheme for unconstrained minimization. Its iterates are all centers of ellipses carefully designed to somehow capture ill-conditioning of the underlying optimization problem. In the first article on ME, we were able to prove that it converges with linear rate when the objective function is quadratic and strongly convex, while here we derive convergence for any differentiable strongly convex objective. This investigation was inspired by the great performance of ME in quadratic minimization against steepest descent with exact line search, FISTA, Barzilai-Borwein and Conjugate Gradient. The experiments we carry out now, make ME even more attractive from the numerical point of view. On top of that, the theory seems promising for quite more general settings.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript extends the Method of Ellipcenters (ME), a first-order accelerated scheme for unconstrained minimization, from the quadratic strongly convex case (previously shown to converge linearly) to arbitrary differentiable strongly convex objectives. It claims a linear convergence rate for the general case and presents numerical experiments showing competitive or superior performance against steepest descent with exact line search, FISTA, Barzilai-Borwein, and conjugate gradient methods.

Significance. A rigorously first-order method achieving linear rates on general differentiable strongly convex problems without explicit global conditioning parameters would be a notable contribution to optimization theory and practice. The reported numerical results on quadratic and non-quadratic instances support the practical appeal, but the absence of a detailed, verifiable convergence argument limits the current impact.

major comments (2)
  1. [Abstract and §1] Abstract and §1: the assertion that linear convergence is derived for any differentiable strongly convex objective supplies no proof outline, key assumptions, or contraction-factor derivation. Without these, the extension from the quadratic case cannot be assessed and the central claim remains unverified.
  2. [Convergence section] Convergence section (presumably §3 or §4): the ellipse-parameter update rule must be shown to depend only on local gradient evaluations at the current iterate and to produce a contraction factor independent of the starting point and of any a-priori global Lipschitz or strong-convexity constants. If the construction implicitly uses such global bounds, the linear-rate guarantee does not hold for merely C^1 strongly convex functions.
minor comments (2)
  1. [Numerical experiments] Table 1 and Figure 2: the iteration counts and function-value plots should include the precise stopping tolerance and the number of gradient evaluations per iteration for each competing method to allow direct comparison of computational cost.
  2. [Method description] Notation: the definition of the ellipse center and the update for its shape matrix should be stated explicitly with equation numbers rather than described only in prose.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful reading and constructive suggestions. We agree that the convergence argument requires a more explicit outline and will revise the manuscript accordingly to strengthen the presentation.

read point-by-point responses
  1. Referee: [Abstract and §1] Abstract and §1: the assertion that linear convergence is derived for any differentiable strongly convex objective supplies no proof outline, key assumptions, or contraction-factor derivation. Without these, the extension from the quadratic case cannot be assessed and the central claim remains unverified.

    Authors: We agree that the abstract and §1 would benefit from an explicit proof sketch. In the revised version we will insert a concise outline immediately after the statement of the main result, listing the standing assumptions (C^1 differentiability and strong convexity) and indicating how the ellipse-centering step produces a contraction factor that depends only on the strong-convexity modulus. revision: yes

  2. Referee: [Convergence section] Convergence section (presumably §3 or §4): the ellipse-parameter update rule must be shown to depend only on local gradient evaluations at the current iterate and to produce a contraction factor independent of the starting point and of any a-priori global Lipschitz or strong-convexity constants. If the construction implicitly uses such global bounds, the linear-rate guarantee does not hold for merely C^1 strongly convex functions.

    Authors: The ellipse-parameter update is constructed from the single gradient vector at the current iterate and from the previous ellipse; no global Lipschitz constant or a-priori strong-convexity bound enters the rule. The contraction factor is then shown to be uniform across all iterates by using only the local strong-convexity inequality at each step. We will expand the convergence section with an explicit lemma that isolates this locality and proves independence from the starting point and from any global constants. revision: yes

Circularity Check

0 steps flagged

No circularity: new derivation for general strongly convex case is independent of quadratic result

full rationale

The abstract introduces ME from prior work by the same authors but claims a fresh derivation of linear-rate convergence for arbitrary differentiable strongly convex objectives. No equations, fitted parameters, or self-referential definitions appear in the provided text. The extension from the quadratic case is presented as a new proof rather than a renaming or direct reuse of prior fitted quantities. Self-citation exists but is not load-bearing for the central claim, as the current manuscript asserts an independent argument under the stated assumptions. No reduction by construction is exhibited.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

Only the abstract is available, so the ledger is limited to the setting stated there.

axioms (1)
  • domain assumption The objective function is differentiable and strongly convex.
    Explicitly required for the new convergence statement.

pith-pipeline@v0.9.0 · 5437 in / 975 out tokens · 38731 ms · 2026-05-14T19:32:12.041251+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

18 extracted references · 18 canonical work pages

  1. [1]

    Introducing the method of ellipcenters, a new first order technique for unconstrained optimization.arXiv, 2025

    Roger Behling and Ramyro Correa and Eduarda Ferreira and Vincent Guigues. Introducing the method of ellipcenters, a new first order technique for unconstrained optimization.arXiv, 2025

  2. [2]

    Two-Point Step Size Gradient Methods.IMA Journal of Numerical Analysis, 8(1):141–148, 1988

    Barzilai, Jonathan and Borwein, Jonathan M. Two-Point Step Size Gradient Methods.IMA Journal of Numerical Analysis, 8(1):141–148, 1988. 15 Method n Iterations Optimal value ME 100 2 -524.6 Barzilai-Borwein long steps 100 7 -524.6 Barzilai-Borwein short steps 100 4 -524.6 ME 1000 2 -751.4 Barzilai-Borwein long steps 1000 7 -751.4 Barzilai-Borwein short ste...

  3. [3]

    IMAGING SCIENCES, 2 (1):183-202, 2009

    Beck, Amir and Teboulle, Marc A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse ProblemsSIAM J. IMAGING SCIENCES, 2 (1):183-202, 2009

  4. [4]

    M´ ethode g´ en´ erale pour la r´ esolution des syst` emes d’´ equations simultan´ ees

    Cauchy, Augustin-Louis. M´ ethode g´ en´ erale pour la r´ esolution des syst` emes d’´ equations simultan´ ees. Comptes Rendus Hebdomadaires des S´ eances de l’Acad´ emie des Sciences, 25:536–538, 1847

  5. [5]

    Practical Methods of Optimization (2nd ed.)New York: John Wiley & Son, 1987

    Fletcher, R. Practical Methods of Optimization (2nd ed.)New York: John Wiley & Son, 1987

  6. [6]

    and Reeves, C

    Fletcher, R. and Reeves, C. M. Function Minimization by Conjugate Gradients.The Computer Journal, 7(2):149–154, 1964

  7. [7]

    and Karas

    Gonzaga, C. and Karas. E. Fine tuning Nesterov’s steepest descent algorithm for differentiable convex programmingMathematical Programming, 138:141–166,2013

  8. [8]

    On optimal universal first-order methods for minimizing heterogeneous sums.Optimization Letters, 1–19, 2023

    Grimmer, B. On optimal universal first-order methods for minimizing heterogeneous sums.Optimization Letters, 1–19, 2023

  9. [9]

    A polynomial algorithm in linear programming.Soviet Mathematics Doklady, 20:191–194, 1979

    Khachiyan, Leonid G. A polynomial algorithm in linear programming.Soviet Mathematics Doklady, 20:191–194, 1979. 16 Method n Iterations Optimal value ME 100 2 4.61 Barzilai-Borwein long steps 100 7 4.61 Barzilai-Borwein short steps 100 4 4.61 ME 1000 2 6.91 Barzilai-Borwein long steps 1000 7 6.91 Barzilai-Borwein short steps 1000 4 6.91 ME 2000 2 8,0 Barzi...

  10. [10]

    Bundle-level type methods uniformly optimal for smooth and nonsmooth convex optimization Mathematical Programming, 149(1-2):1–45,2015

    Lan, G. Bundle-level type methods uniformly optimal for smooth and nonsmooth convex optimization Mathematical Programming, 149(1-2):1–45,2015

  11. [11]

    and Monteiro, R

    Liang, J. and Monteiro, R. D. C. An average curvature accelerated composite gradient method for non- convex smooth composite optimization problemsSIAM Journal on Optimization, 31(1):217–243,2021

  12. [12]

    and Monteiro, R

    Liang, J. and Monteiro, R. D. C. Average curvature FISTA for nonconvex smooth composite optimization problemsComputational Optimization and Applications, 86(1):275–302,2023

  13. [13]

    and Malitsky, Y

    Mishchenko, K. and Malitsky, Y. Adaptive gradient descent without descent37th International Conference on Machine Learning (ICML 2020),2020

  14. [14]

    A method of solving a convex programming problem with convergence rateO(1/k 2).Soviet 17 Mathematics Doklady, 27(2):372–376, 1983

    Nesterov, Y. A method of solving a convex programming problem with convergence rateO(1/k 2).Soviet 17 Mathematics Doklady, 27(2):372–376, 1983

  15. [15]

    Universal gradient methods for convex optimization problemsMathematical Programming, 152(1):381–404,2015

    Nesterov, Y. Universal gradient methods for convex optimization problemsMathematical Programming, 152(1):381–404,2015

  16. [16]

    and Grimmer, B

    Renegar, J. and Grimmer, B. A simple nearly optimal restart scheme for speeding up first-order methods Foundations of Computational Mathematics, 22(1):211–256,2022

  17. [17]

    Convergence Conditions for Ascent Methods.SIAM Review, 11(2):226–235, 1969

    Wolfe, Philip. Convergence Conditions for Ascent Methods.SIAM Review, 11(2):226–235, 1969

  18. [18]

    and Ma, S

    Zhou, D. and Ma, S. and Yang, J. AdaBB: Adaptive Barzilai-Borwein method for convex optimization Mathematics of Operations Research, 2025. 18