The Method of Ellipcenters for strongly convex minimization
Pith reviewed 2026-05-14 19:32 UTC · model grok-4.3
The pith
The Method of Ellipcenters converges linearly for any differentiable strongly convex objective.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We derive a linear convergence rate for the Method of Ellipcenters applied to any differentiable strongly convex objective, where the iterates are chosen as centers of ellipses that are carefully designed to capture the ill-conditioning of the underlying optimization problem.
What carries the argument
Ellipcenters: the centers of ellipses constructed at each step to approximate and capture the local conditioning of the strongly convex function.
Load-bearing premise
Ellipses can be constructed at each iteration that capture the ill-conditioning of any differentiable strongly convex function without losing the linear convergence rate.
What would settle it
A counterexample of a differentiable strongly convex function for which the method fails to exhibit linear convergence would falsify the result.
read the original abstract
This work is about ME, the Method of Ellipcenters. ME was recently introduced by these very authors as a first order accelerated scheme for unconstrained minimization. Its iterates are all centers of ellipses carefully designed to somehow capture ill-conditioning of the underlying optimization problem. In the first article on ME, we were able to prove that it converges with linear rate when the objective function is quadratic and strongly convex, while here we derive convergence for any differentiable strongly convex objective. This investigation was inspired by the great performance of ME in quadratic minimization against steepest descent with exact line search, FISTA, Barzilai-Borwein and Conjugate Gradient. The experiments we carry out now, make ME even more attractive from the numerical point of view. On top of that, the theory seems promising for quite more general settings.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript extends the Method of Ellipcenters (ME), a first-order accelerated scheme for unconstrained minimization, from the quadratic strongly convex case (previously shown to converge linearly) to arbitrary differentiable strongly convex objectives. It claims a linear convergence rate for the general case and presents numerical experiments showing competitive or superior performance against steepest descent with exact line search, FISTA, Barzilai-Borwein, and conjugate gradient methods.
Significance. A rigorously first-order method achieving linear rates on general differentiable strongly convex problems without explicit global conditioning parameters would be a notable contribution to optimization theory and practice. The reported numerical results on quadratic and non-quadratic instances support the practical appeal, but the absence of a detailed, verifiable convergence argument limits the current impact.
major comments (2)
- [Abstract and §1] Abstract and §1: the assertion that linear convergence is derived for any differentiable strongly convex objective supplies no proof outline, key assumptions, or contraction-factor derivation. Without these, the extension from the quadratic case cannot be assessed and the central claim remains unverified.
- [Convergence section] Convergence section (presumably §3 or §4): the ellipse-parameter update rule must be shown to depend only on local gradient evaluations at the current iterate and to produce a contraction factor independent of the starting point and of any a-priori global Lipschitz or strong-convexity constants. If the construction implicitly uses such global bounds, the linear-rate guarantee does not hold for merely C^1 strongly convex functions.
minor comments (2)
- [Numerical experiments] Table 1 and Figure 2: the iteration counts and function-value plots should include the precise stopping tolerance and the number of gradient evaluations per iteration for each competing method to allow direct comparison of computational cost.
- [Method description] Notation: the definition of the ellipse center and the update for its shape matrix should be stated explicitly with equation numbers rather than described only in prose.
Simulated Author's Rebuttal
We thank the referee for the careful reading and constructive suggestions. We agree that the convergence argument requires a more explicit outline and will revise the manuscript accordingly to strengthen the presentation.
read point-by-point responses
-
Referee: [Abstract and §1] Abstract and §1: the assertion that linear convergence is derived for any differentiable strongly convex objective supplies no proof outline, key assumptions, or contraction-factor derivation. Without these, the extension from the quadratic case cannot be assessed and the central claim remains unverified.
Authors: We agree that the abstract and §1 would benefit from an explicit proof sketch. In the revised version we will insert a concise outline immediately after the statement of the main result, listing the standing assumptions (C^1 differentiability and strong convexity) and indicating how the ellipse-centering step produces a contraction factor that depends only on the strong-convexity modulus. revision: yes
-
Referee: [Convergence section] Convergence section (presumably §3 or §4): the ellipse-parameter update rule must be shown to depend only on local gradient evaluations at the current iterate and to produce a contraction factor independent of the starting point and of any a-priori global Lipschitz or strong-convexity constants. If the construction implicitly uses such global bounds, the linear-rate guarantee does not hold for merely C^1 strongly convex functions.
Authors: The ellipse-parameter update is constructed from the single gradient vector at the current iterate and from the previous ellipse; no global Lipschitz constant or a-priori strong-convexity bound enters the rule. The contraction factor is then shown to be uniform across all iterates by using only the local strong-convexity inequality at each step. We will expand the convergence section with an explicit lemma that isolates this locality and proves independence from the starting point and from any global constants. revision: yes
Circularity Check
No circularity: new derivation for general strongly convex case is independent of quadratic result
full rationale
The abstract introduces ME from prior work by the same authors but claims a fresh derivation of linear-rate convergence for arbitrary differentiable strongly convex objectives. No equations, fitted parameters, or self-referential definitions appear in the provided text. The extension from the quadratic case is presented as a new proof rather than a renaming or direct reuse of prior fitted quantities. Self-citation exists but is not load-bearing for the central claim, as the current manuscript asserts an independent argument under the stated assumptions. No reduction by construction is exhibited.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption The objective function is differentiable and strongly convex.
Lean theorems connected to this paper
-
IndisputableMonolith/Cost/FunctionalEquation.leanwashburn_uniqueness_aczel unclear?
unclearRelation between the paper passage and the cited Recognition theorem.
Its iterates are all centers of ellipses carefully designed to somehow capture ill-conditioning... ME1 E_k is an ellipse contained in Π_k; ME2 E_k is orthogonal to ∇f(x_k) at x_k; ME3 E_k is orthogonal to ∇f(y_k) at y_k.
-
IndisputableMonolith/Foundation/AbsoluteFloorClosure.leanabsolute_floor_iff_bare_distinguishability unclear?
unclearRelation between the paper passage and the cited Recognition theorem.
we derive convergence for any differentiable strongly convex objective... f(x_{k+1}) ≤ f(z_k) < f(x_k) ... lim ||∇f(x_k)|| = 0
What do these tags mean?
- matches
- The paper's claim is directly supported by a theorem in the formal canon.
- supports
- The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends
- The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses
- The paper appears to rely on the theorem as machinery.
- contradicts
- The paper's claim conflicts with a theorem or certificate in the canon.
- unclear
- Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
-
[1]
Roger Behling and Ramyro Correa and Eduarda Ferreira and Vincent Guigues. Introducing the method of ellipcenters, a new first order technique for unconstrained optimization.arXiv, 2025
work page 2025
-
[2]
Two-Point Step Size Gradient Methods.IMA Journal of Numerical Analysis, 8(1):141–148, 1988
Barzilai, Jonathan and Borwein, Jonathan M. Two-Point Step Size Gradient Methods.IMA Journal of Numerical Analysis, 8(1):141–148, 1988. 15 Method n Iterations Optimal value ME 100 2 -524.6 Barzilai-Borwein long steps 100 7 -524.6 Barzilai-Borwein short steps 100 4 -524.6 ME 1000 2 -751.4 Barzilai-Borwein long steps 1000 7 -751.4 Barzilai-Borwein short ste...
work page 1988
-
[3]
IMAGING SCIENCES, 2 (1):183-202, 2009
Beck, Amir and Teboulle, Marc A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse ProblemsSIAM J. IMAGING SCIENCES, 2 (1):183-202, 2009
work page 2009
-
[4]
M´ ethode g´ en´ erale pour la r´ esolution des syst` emes d’´ equations simultan´ ees
Cauchy, Augustin-Louis. M´ ethode g´ en´ erale pour la r´ esolution des syst` emes d’´ equations simultan´ ees. Comptes Rendus Hebdomadaires des S´ eances de l’Acad´ emie des Sciences, 25:536–538, 1847
-
[5]
Practical Methods of Optimization (2nd ed.)New York: John Wiley & Son, 1987
Fletcher, R. Practical Methods of Optimization (2nd ed.)New York: John Wiley & Son, 1987
work page 1987
-
[6]
Fletcher, R. and Reeves, C. M. Function Minimization by Conjugate Gradients.The Computer Journal, 7(2):149–154, 1964
work page 1964
- [7]
-
[8]
Grimmer, B. On optimal universal first-order methods for minimizing heterogeneous sums.Optimization Letters, 1–19, 2023
work page 2023
-
[9]
A polynomial algorithm in linear programming.Soviet Mathematics Doklady, 20:191–194, 1979
Khachiyan, Leonid G. A polynomial algorithm in linear programming.Soviet Mathematics Doklady, 20:191–194, 1979. 16 Method n Iterations Optimal value ME 100 2 4.61 Barzilai-Borwein long steps 100 7 4.61 Barzilai-Borwein short steps 100 4 4.61 ME 1000 2 6.91 Barzilai-Borwein long steps 1000 7 6.91 Barzilai-Borwein short steps 1000 4 6.91 ME 2000 2 8,0 Barzi...
work page 1979
-
[10]
Lan, G. Bundle-level type methods uniformly optimal for smooth and nonsmooth convex optimization Mathematical Programming, 149(1-2):1–45,2015
work page 2015
-
[11]
Liang, J. and Monteiro, R. D. C. An average curvature accelerated composite gradient method for non- convex smooth composite optimization problemsSIAM Journal on Optimization, 31(1):217–243,2021
work page 2021
-
[12]
Liang, J. and Monteiro, R. D. C. Average curvature FISTA for nonconvex smooth composite optimization problemsComputational Optimization and Applications, 86(1):275–302,2023
work page 2023
-
[13]
Mishchenko, K. and Malitsky, Y. Adaptive gradient descent without descent37th International Conference on Machine Learning (ICML 2020),2020
work page 2020
-
[14]
Nesterov, Y. A method of solving a convex programming problem with convergence rateO(1/k 2).Soviet 17 Mathematics Doklady, 27(2):372–376, 1983
work page 1983
-
[15]
Nesterov, Y. Universal gradient methods for convex optimization problemsMathematical Programming, 152(1):381–404,2015
work page 2015
-
[16]
Renegar, J. and Grimmer, B. A simple nearly optimal restart scheme for speeding up first-order methods Foundations of Computational Mathematics, 22(1):211–256,2022
work page 2022
-
[17]
Convergence Conditions for Ascent Methods.SIAM Review, 11(2):226–235, 1969
Wolfe, Philip. Convergence Conditions for Ascent Methods.SIAM Review, 11(2):226–235, 1969
work page 1969
- [18]
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.