The Method of Ellipcenters for strongly convex minimization

Eduarda Ferreira; Ramyro Correa; Roger Behling; Vincent Guigues

arxiv: 2605.12820 · v1 · pith:JNJRFGSWnew · submitted 2026-05-12 · 🧮 math.OC

The Method of Ellipcenters for strongly convex minimization

Roger Behling , Ramyro Correa , Eduarda Ferreira , Vincent Guigues This is my paper

Pith reviewed 2026-05-14 19:32 UTC · model grok-4.3

classification 🧮 math.OC

keywords ellipcentersstrongly convexlinear convergencefirst-order methodsaccelerated optimizationunconstrained minimizationill-conditioned problems

0 comments

The pith

The Method of Ellipcenters converges linearly for any differentiable strongly convex objective.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that the Method of Ellipcenters, an accelerated first-order scheme, achieves linear convergence not only for quadratic strongly convex problems but for any differentiable strongly convex function. Each iterate is the center of an ellipse designed to capture the ill-conditioning of the objective. This geometric construction allows the method to maintain its convergence rate across a broader class of functions. Experiments indicate that this approach performs well numerically compared to methods like FISTA and conjugate gradient. The results suggest the method is promising for even more general optimization settings.

Core claim

We derive a linear convergence rate for the Method of Ellipcenters applied to any differentiable strongly convex objective, where the iterates are chosen as centers of ellipses that are carefully designed to capture the ill-conditioning of the underlying optimization problem.

What carries the argument

Ellipcenters: the centers of ellipses constructed at each step to approximate and capture the local conditioning of the strongly convex function.

Load-bearing premise

Ellipses can be constructed at each iteration that capture the ill-conditioning of any differentiable strongly convex function without losing the linear convergence rate.

What would settle it

A counterexample of a differentiable strongly convex function for which the method fails to exhibit linear convergence would falsify the result.

read the original abstract

This work is about ME, the Method of Ellipcenters. ME was recently introduced by these very authors as a first order accelerated scheme for unconstrained minimization. Its iterates are all centers of ellipses carefully designed to somehow capture ill-conditioning of the underlying optimization problem. In the first article on ME, we were able to prove that it converges with linear rate when the objective function is quadratic and strongly convex, while here we derive convergence for any differentiable strongly convex objective. This investigation was inspired by the great performance of ME in quadratic minimization against steepest descent with exact line search, FISTA, Barzilai-Borwein and Conjugate Gradient. The experiments we carry out now, make ME even more attractive from the numerical point of view. On top of that, the theory seems promising for quite more general settings.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper claims to extend linear-rate convergence of the Method of Ellipcenters from quadratics to arbitrary differentiable strongly convex functions, but the abstract supplies no proof steps or ellipse-update rule, so the central claim stays uncheckable.

read the letter

The main new piece is the extension of ME beyond quadratics: the authors say they now prove linear convergence for any differentiable strongly convex objective while keeping the same ellipse-centered iterates. Numerically they show the method still beats steepest descent with exact line search, FISTA, Barzilai-Borwein, and conjugate gradient on the problems they tested, which is the part that actually looks useful right now.

Referee Report

2 major / 2 minor

Summary. The manuscript extends the Method of Ellipcenters (ME), a first-order accelerated scheme for unconstrained minimization, from the quadratic strongly convex case (previously shown to converge linearly) to arbitrary differentiable strongly convex objectives. It claims a linear convergence rate for the general case and presents numerical experiments showing competitive or superior performance against steepest descent with exact line search, FISTA, Barzilai-Borwein, and conjugate gradient methods.

Significance. A rigorously first-order method achieving linear rates on general differentiable strongly convex problems without explicit global conditioning parameters would be a notable contribution to optimization theory and practice. The reported numerical results on quadratic and non-quadratic instances support the practical appeal, but the absence of a detailed, verifiable convergence argument limits the current impact.

major comments (2)

[Abstract and §1] Abstract and §1: the assertion that linear convergence is derived for any differentiable strongly convex objective supplies no proof outline, key assumptions, or contraction-factor derivation. Without these, the extension from the quadratic case cannot be assessed and the central claim remains unverified.
[Convergence section] Convergence section (presumably §3 or §4): the ellipse-parameter update rule must be shown to depend only on local gradient evaluations at the current iterate and to produce a contraction factor independent of the starting point and of any a-priori global Lipschitz or strong-convexity constants. If the construction implicitly uses such global bounds, the linear-rate guarantee does not hold for merely C^1 strongly convex functions.

minor comments (2)

[Numerical experiments] Table 1 and Figure 2: the iteration counts and function-value plots should include the precise stopping tolerance and the number of gradient evaluations per iteration for each competing method to allow direct comparison of computational cost.
[Method description] Notation: the definition of the ellipse center and the update for its shape matrix should be stated explicitly with equation numbers rather than described only in prose.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful reading and constructive suggestions. We agree that the convergence argument requires a more explicit outline and will revise the manuscript accordingly to strengthen the presentation.

read point-by-point responses

Referee: [Abstract and §1] Abstract and §1: the assertion that linear convergence is derived for any differentiable strongly convex objective supplies no proof outline, key assumptions, or contraction-factor derivation. Without these, the extension from the quadratic case cannot be assessed and the central claim remains unverified.

Authors: We agree that the abstract and §1 would benefit from an explicit proof sketch. In the revised version we will insert a concise outline immediately after the statement of the main result, listing the standing assumptions (C^1 differentiability and strong convexity) and indicating how the ellipse-centering step produces a contraction factor that depends only on the strong-convexity modulus. revision: yes
Referee: [Convergence section] Convergence section (presumably §3 or §4): the ellipse-parameter update rule must be shown to depend only on local gradient evaluations at the current iterate and to produce a contraction factor independent of the starting point and of any a-priori global Lipschitz or strong-convexity constants. If the construction implicitly uses such global bounds, the linear-rate guarantee does not hold for merely C^1 strongly convex functions.

Authors: The ellipse-parameter update is constructed from the single gradient vector at the current iterate and from the previous ellipse; no global Lipschitz constant or a-priori strong-convexity bound enters the rule. The contraction factor is then shown to be uniform across all iterates by using only the local strong-convexity inequality at each step. We will expand the convergence section with an explicit lemma that isolates this locality and proves independence from the starting point and from any global constants. revision: yes

Circularity Check

0 steps flagged

No circularity: new derivation for general strongly convex case is independent of quadratic result

full rationale

The abstract introduces ME from prior work by the same authors but claims a fresh derivation of linear-rate convergence for arbitrary differentiable strongly convex objectives. No equations, fitted parameters, or self-referential definitions appear in the provided text. The extension from the quadratic case is presented as a new proof rather than a renaming or direct reuse of prior fitted quantities. Self-citation exists but is not load-bearing for the central claim, as the current manuscript asserts an independent argument under the stated assumptions. No reduction by construction is exhibited.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

Only the abstract is available, so the ledger is limited to the setting stated there.

axioms (1)

domain assumption The objective function is differentiable and strongly convex.
Explicitly required for the new convergence statement.

pith-pipeline@v0.9.0 · 5437 in / 975 out tokens · 38731 ms · 2026-05-14T19:32:12.041251+00:00 · methodology

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

Its iterates are all centers of ellipses carefully designed to somehow capture ill-conditioning... ME1 E_k is an ellipse contained in Π_k; ME2 E_k is orthogonal to ∇f(x_k) at x_k; ME3 E_k is orthogonal to ∇f(y_k) at y_k.
IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean absolute_floor_iff_bare_distinguishability unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

we derive convergence for any differentiable strongly convex objective... f(x_{k+1}) ≤ f(z_k) < f(x_k) ... lim ||∇f(x_k)|| = 0

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

18 extracted references · 18 canonical work pages

[1]

Introducing the method of ellipcenters, a new first order technique for unconstrained optimization.arXiv, 2025

Roger Behling and Ramyro Correa and Eduarda Ferreira and Vincent Guigues. Introducing the method of ellipcenters, a new first order technique for unconstrained optimization.arXiv, 2025

work page 2025
[2]

Two-Point Step Size Gradient Methods.IMA Journal of Numerical Analysis, 8(1):141–148, 1988

Barzilai, Jonathan and Borwein, Jonathan M. Two-Point Step Size Gradient Methods.IMA Journal of Numerical Analysis, 8(1):141–148, 1988. 15 Method n Iterations Optimal value ME 100 2 -524.6 Barzilai-Borwein long steps 100 7 -524.6 Barzilai-Borwein short steps 100 4 -524.6 ME 1000 2 -751.4 Barzilai-Borwein long steps 1000 7 -751.4 Barzilai-Borwein short ste...

work page 1988
[3]

IMAGING SCIENCES, 2 (1):183-202, 2009

Beck, Amir and Teboulle, Marc A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse ProblemsSIAM J. IMAGING SCIENCES, 2 (1):183-202, 2009

work page 2009
[4]

M´ ethode g´ en´ erale pour la r´ esolution des syst` emes d’´ equations simultan´ ees

Cauchy, Augustin-Louis. M´ ethode g´ en´ erale pour la r´ esolution des syst` emes d’´ equations simultan´ ees. Comptes Rendus Hebdomadaires des S´ eances de l’Acad´ emie des Sciences, 25:536–538, 1847

work page
[5]

Practical Methods of Optimization (2nd ed.)New York: John Wiley & Son, 1987

Fletcher, R. Practical Methods of Optimization (2nd ed.)New York: John Wiley & Son, 1987

work page 1987
[6]

and Reeves, C

Fletcher, R. and Reeves, C. M. Function Minimization by Conjugate Gradients.The Computer Journal, 7(2):149–154, 1964

work page 1964
[7]

and Karas

Gonzaga, C. and Karas. E. Fine tuning Nesterov’s steepest descent algorithm for differentiable convex programmingMathematical Programming, 138:141–166,2013

work page 2013
[8]

On optimal universal first-order methods for minimizing heterogeneous sums.Optimization Letters, 1–19, 2023

Grimmer, B. On optimal universal first-order methods for minimizing heterogeneous sums.Optimization Letters, 1–19, 2023

work page 2023
[9]

A polynomial algorithm in linear programming.Soviet Mathematics Doklady, 20:191–194, 1979

Khachiyan, Leonid G. A polynomial algorithm in linear programming.Soviet Mathematics Doklady, 20:191–194, 1979. 16 Method n Iterations Optimal value ME 100 2 4.61 Barzilai-Borwein long steps 100 7 4.61 Barzilai-Borwein short steps 100 4 4.61 ME 1000 2 6.91 Barzilai-Borwein long steps 1000 7 6.91 Barzilai-Borwein short steps 1000 4 6.91 ME 2000 2 8,0 Barzi...

work page 1979
[10]

Bundle-level type methods uniformly optimal for smooth and nonsmooth convex optimization Mathematical Programming, 149(1-2):1–45,2015

Lan, G. Bundle-level type methods uniformly optimal for smooth and nonsmooth convex optimization Mathematical Programming, 149(1-2):1–45,2015

work page 2015
[11]

and Monteiro, R

Liang, J. and Monteiro, R. D. C. An average curvature accelerated composite gradient method for non- convex smooth composite optimization problemsSIAM Journal on Optimization, 31(1):217–243,2021

work page 2021
[12]

and Monteiro, R

Liang, J. and Monteiro, R. D. C. Average curvature FISTA for nonconvex smooth composite optimization problemsComputational Optimization and Applications, 86(1):275–302,2023

work page 2023
[13]

and Malitsky, Y

Mishchenko, K. and Malitsky, Y. Adaptive gradient descent without descent37th International Conference on Machine Learning (ICML 2020),2020

work page 2020
[14]

A method of solving a convex programming problem with convergence rateO(1/k 2).Soviet 17 Mathematics Doklady, 27(2):372–376, 1983

Nesterov, Y. A method of solving a convex programming problem with convergence rateO(1/k 2).Soviet 17 Mathematics Doklady, 27(2):372–376, 1983

work page 1983
[15]

Universal gradient methods for convex optimization problemsMathematical Programming, 152(1):381–404,2015

Nesterov, Y. Universal gradient methods for convex optimization problemsMathematical Programming, 152(1):381–404,2015

work page 2015
[16]

and Grimmer, B

Renegar, J. and Grimmer, B. A simple nearly optimal restart scheme for speeding up first-order methods Foundations of Computational Mathematics, 22(1):211–256,2022

work page 2022
[17]

Convergence Conditions for Ascent Methods.SIAM Review, 11(2):226–235, 1969

Wolfe, Philip. Convergence Conditions for Ascent Methods.SIAM Review, 11(2):226–235, 1969

work page 1969
[18]

and Ma, S

Zhou, D. and Ma, S. and Yang, J. AdaBB: Adaptive Barzilai-Borwein method for convex optimization Mathematics of Operations Research, 2025. 18

work page 2025

[1] [1]

Introducing the method of ellipcenters, a new first order technique for unconstrained optimization.arXiv, 2025

Roger Behling and Ramyro Correa and Eduarda Ferreira and Vincent Guigues. Introducing the method of ellipcenters, a new first order technique for unconstrained optimization.arXiv, 2025

work page 2025

[2] [2]

Two-Point Step Size Gradient Methods.IMA Journal of Numerical Analysis, 8(1):141–148, 1988

Barzilai, Jonathan and Borwein, Jonathan M. Two-Point Step Size Gradient Methods.IMA Journal of Numerical Analysis, 8(1):141–148, 1988. 15 Method n Iterations Optimal value ME 100 2 -524.6 Barzilai-Borwein long steps 100 7 -524.6 Barzilai-Borwein short steps 100 4 -524.6 ME 1000 2 -751.4 Barzilai-Borwein long steps 1000 7 -751.4 Barzilai-Borwein short ste...

work page 1988

[3] [3]

IMAGING SCIENCES, 2 (1):183-202, 2009

Beck, Amir and Teboulle, Marc A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse ProblemsSIAM J. IMAGING SCIENCES, 2 (1):183-202, 2009

work page 2009

[4] [4]

M´ ethode g´ en´ erale pour la r´ esolution des syst` emes d’´ equations simultan´ ees

Cauchy, Augustin-Louis. M´ ethode g´ en´ erale pour la r´ esolution des syst` emes d’´ equations simultan´ ees. Comptes Rendus Hebdomadaires des S´ eances de l’Acad´ emie des Sciences, 25:536–538, 1847

work page

[5] [5]

Practical Methods of Optimization (2nd ed.)New York: John Wiley & Son, 1987

Fletcher, R. Practical Methods of Optimization (2nd ed.)New York: John Wiley & Son, 1987

work page 1987

[6] [6]

and Reeves, C

Fletcher, R. and Reeves, C. M. Function Minimization by Conjugate Gradients.The Computer Journal, 7(2):149–154, 1964

work page 1964

[7] [7]

and Karas

Gonzaga, C. and Karas. E. Fine tuning Nesterov’s steepest descent algorithm for differentiable convex programmingMathematical Programming, 138:141–166,2013

work page 2013

[8] [8]

On optimal universal first-order methods for minimizing heterogeneous sums.Optimization Letters, 1–19, 2023

Grimmer, B. On optimal universal first-order methods for minimizing heterogeneous sums.Optimization Letters, 1–19, 2023

work page 2023

[9] [9]

A polynomial algorithm in linear programming.Soviet Mathematics Doklady, 20:191–194, 1979

Khachiyan, Leonid G. A polynomial algorithm in linear programming.Soviet Mathematics Doklady, 20:191–194, 1979. 16 Method n Iterations Optimal value ME 100 2 4.61 Barzilai-Borwein long steps 100 7 4.61 Barzilai-Borwein short steps 100 4 4.61 ME 1000 2 6.91 Barzilai-Borwein long steps 1000 7 6.91 Barzilai-Borwein short steps 1000 4 6.91 ME 2000 2 8,0 Barzi...

work page 1979

[10] [10]

Bundle-level type methods uniformly optimal for smooth and nonsmooth convex optimization Mathematical Programming, 149(1-2):1–45,2015

Lan, G. Bundle-level type methods uniformly optimal for smooth and nonsmooth convex optimization Mathematical Programming, 149(1-2):1–45,2015

work page 2015

[11] [11]

and Monteiro, R

Liang, J. and Monteiro, R. D. C. An average curvature accelerated composite gradient method for non- convex smooth composite optimization problemsSIAM Journal on Optimization, 31(1):217–243,2021

work page 2021

[12] [12]

and Monteiro, R

Liang, J. and Monteiro, R. D. C. Average curvature FISTA for nonconvex smooth composite optimization problemsComputational Optimization and Applications, 86(1):275–302,2023

work page 2023

[13] [13]

and Malitsky, Y

Mishchenko, K. and Malitsky, Y. Adaptive gradient descent without descent37th International Conference on Machine Learning (ICML 2020),2020

work page 2020

[14] [14]

A method of solving a convex programming problem with convergence rateO(1/k 2).Soviet 17 Mathematics Doklady, 27(2):372–376, 1983

Nesterov, Y. A method of solving a convex programming problem with convergence rateO(1/k 2).Soviet 17 Mathematics Doklady, 27(2):372–376, 1983

work page 1983

[15] [15]

Universal gradient methods for convex optimization problemsMathematical Programming, 152(1):381–404,2015

Nesterov, Y. Universal gradient methods for convex optimization problemsMathematical Programming, 152(1):381–404,2015

work page 2015

[16] [16]

and Grimmer, B

Renegar, J. and Grimmer, B. A simple nearly optimal restart scheme for speeding up first-order methods Foundations of Computational Mathematics, 22(1):211–256,2022

work page 2022

[17] [17]

Convergence Conditions for Ascent Methods.SIAM Review, 11(2):226–235, 1969

Wolfe, Philip. Convergence Conditions for Ascent Methods.SIAM Review, 11(2):226–235, 1969

work page 1969

[18] [18]

and Ma, S

Zhou, D. and Ma, S. and Yang, J. AdaBB: Adaptive Barzilai-Borwein method for convex optimization Mathematics of Operations Research, 2025. 18

work page 2025