pith. sign in

arxiv: 2605.02231 · v1 · submitted 2026-05-04 · 🧮 math.OC

A Theory of Composition and Duality of Extremal Optimal Fixed-Point Algorithms

Pith reviewed 2026-05-08 18:44 UTC · model grok-4.3

classification 🧮 math.OC
keywords optimal fixed-point algorithmsnonexpansive operatorsminimax optimalityarc diagramsalgorithm dualitycombinatorial structurequasi-anytime convergence
0
0 comments X

The pith

Optimal (N-1)-step fixed-point algorithms form a polytope whose (N-1)! vertices each correspond to a unique arc diagram encoding its convergence proof.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that the exact minimax optimal fixed-step algorithms for nonexpansive fixed-point problems possess a precise combinatorial structure. For any number of steps, the optimal algorithms lie at the vertices of a polytope that contains exactly (N-1)! such vertices. Each vertex algorithm is represented by an arc diagram whose edges directly encode the sequence of inequalities needed to prove its convergence rate. This structure unifies previously known optimal methods and supplies a graphical language for composing new ones, including dual algorithms obtained by anti-diagonal transposition of the step matrix and algorithms that retain near-optimal guarantees even when the operator slightly violates nonexpansiveness.

Core claim

The set of optimal (N-1)-step algorithms, represented by lower-triangular matrices H whose worst-case residual is exactly minimax over all nonexpansive operators, consists of precisely (N-1)! extremal algorithms. Each such algorithm corresponds to an arc diagram that encodes its convergence proof. These diagrams permit systematic composition and decomposition of any optimal algorithm, and the H-dual operation preserves optimality precisely when the diagram satisfies a reflection symmetry condition, in which case the dual diagram yields the convergence proof of the transposed algorithm. The same machinery produces new optimal algorithms that deliver optimal residuals along an increasing index

What carries the argument

The arc diagram, a planar graph whose arcs record the exact sequence of nonexpansiveness applications that certify the minimax optimality of a vertex algorithm.

If this is right

  • Any optimal algorithm with more than one step can be decomposed into a composition of vertex algorithms whose arc diagrams are concatenated in a canonical way.
  • The anti-diagonal transpose of a vertex algorithm remains optimal precisely when its arc diagram is symmetric under reflection, and the reflected diagram supplies the proof for the dual.
  • New optimal algorithms exist that achieve the minimax rate at every iterate belonging to an arithmetic progression and remain optimal on a neighborhood of nonexpansive operators.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The arc-diagram representation may allow enumeration or optimization over the full set of optimal algorithms for concrete operator instances without enumerating the entire polytope.
  • The same combinatorial objects could supply canonical proofs for related splitting methods whose convergence is currently established by ad-hoc arguments.
  • Quasi-anytime algorithms obtained this way may extend directly to stochastic or inexact fixed-point settings where exact nonexpansiveness holds only in expectation.

Load-bearing premise

Fixed-step algorithms can be represented by lower-triangular matrices H whose optimality is defined by exact minimax residual bounds taken over the entire class of nonexpansive operators.

What would settle it

Compute the vertices of the optimality polytope for N=4 and check whether their number is exactly 6; if it is not, or if any vertex diagram fails to produce a valid convergence proof, the claimed enumeration collapses.

Figures

Figures reproduced from arXiv: 2605.02231 by Benjamin Grimmer, TaeHo Yoon.

Figure 1
Figure 1. Figure 1: Top: Arc diagrams for RDO(2) and OHM with N = 5, and their gluing G1 ▷◁ G2. Bottom: The corresponding vertices of H⋆(N − 1) and their gluing H1 ▷◁ H2, as defined in Definition 3. By Theorem 2, H1 ▷◁ H2 is the optimal vertex algorithm corresponding to G1 ▷◁ G2. The entries and arc added by the gluing operation are highlighted in red. 3.3 Gluing operation for composing algorithms and diagrams Arc diagrams ar… view at source ↗
Figure 2
Figure 2. Figure 2: Examples of basic (non-crossing) and non-basic (crossing) arc diagrams on view at source ↗
Figure 3
Figure 3. Figure 3: An example of recursive dualization of a basic arc diagram with view at source ↗
Figure 4
Figure 4. Figure 4: An example arc diagram for Repeated Dual-OHM with period view at source ↗
Figure 5
Figure 5. Figure 5: The arc diagram for the fractal self-dual algorithm view at source ↗
Figure 6
Figure 6. Figure 6: Heatmap of H(n) , n = 7. Proof. We argue by induction on n. There is nothing to prove in the base case n = 1, when we have H(1) = h 1 2 i . Next, we see that by Theorem 2, applied with N = 2n and N′ = 2n−1 , we must have the claimed block matrix structure with a (n−1) j = 2 nX−1−1 i=j hi,j for j = 1, . . . , 2 n−1 − 1, i.e., a (n−1) is the vector obtained by taking the sum of each column from H(n−1). Becau… view at source ↗
Figure 7
Figure 7. Figure 7: Performance comparison of OHM, Dual-OHM, RDO (periodic), and FSDM on four distinct fixed-point operators view at source ↗
read the original abstract

In this work, we reveal a rich combinatorial structure underlying exact minimax optimal algorithms for classical nonexpansive fixed-point problems. This viewpoint unifies all extremal optimal methods and provides a systematic and practical framework for designing new algorithms via diagrams. Specifically, we study fixed-step algorithms represented by a lower triangular matrix H, and show that the set of optimal (N-1)-step algorithms has exactly (N-1)! vertices (extremal algorithms), each of which naturally corresponds to an arc diagram, a graph that encodes its convergence proof. Using these arc diagrams, we can compose, decompose, and analyze the properties of distinct optimal vertex algorithms. Furthermore, we determine when the H-dual operation, given by taking the anti-diagonal transpose of H, preserves the optimality of a vertex algorithm, and in such cases we characterize the convergence proof of the dual algorithm. Based on this machinery, we develop new optimal algorithms with quasi-anytime guarantees; that is, they admit an increasing integer sequence such that the corresponding iterates have the optimal residual guarantees, and are additionally robust to fixed-point operators that violate nonexpansiveness.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript develops a combinatorial theory for extremal optimal fixed-step algorithms for nonexpansive fixed-point problems. Algorithms are parametrized by lower-triangular matrices H. The central claim is that the set of optimal (N-1)-step algorithms consists of exactly (N-1)! vertices (extremal algorithms), each corresponding to an arc diagram that encodes its convergence proof. The paper shows how to compose and decompose these via the diagrams, determines when the H-dual (anti-diagonal transpose of H) preserves optimality and characterizes the dual convergence proof, and constructs new optimal algorithms with quasi-anytime guarantees that remain robust when the operator violates nonexpansiveness.

Significance. If the vertex-count and arc-diagram bijection hold, the work supplies a unifying graphical language for optimal fixed-point algorithms and a practical design tool. The explicit enumeration of (N-1)! extremal points, the composition/decomposition rules, and the duality characterization are concrete strengths. The construction of quasi-anytime robust algorithms is a notable applied contribution. The framework could accelerate discovery of new optimal methods in optimization theory.

major comments (2)
  1. [Abstract and §3] Abstract and §3 (vertex-count theorem): the claim that the feasible set of lower-triangular H achieving the exact minimax residual bound over all nonexpansive operators is a polytope with precisely (N-1)! vertices is load-bearing. The manuscript must derive the explicit (linear) constraints imposed by the infinite-dimensional sup and prove that the optimum is attained only at isolated vertices numbering exactly (N-1)!, rather than on positive-dimensional faces or under additional nonlinear constraints.
  2. [§2.1] §2.1 (lower-triangular parametrization): the restriction to lower-triangular H is presented as the representation for fixed-step algorithms. The paper should verify whether every optimal algorithm (or every algorithm achieving the minimax bound) admits such a representation, or whether the restriction excludes candidate algorithms that could achieve the same bound outside this class.
minor comments (2)
  1. [§3] The definition and basic properties of arc diagrams would be clearer with one or two fully worked small-N examples (e.g., N=3 or 4) placed immediately after their introduction.
  2. [§4] Notation for the anti-diagonal transpose (H-dual) should be introduced with an explicit matrix example to avoid ambiguity when the dimension changes.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful reading and constructive comments on our manuscript. The points raised about the explicit characterization of the polytope and the completeness of the lower-triangular parametrization are helpful for strengthening the presentation. We address each major comment below and indicate the corresponding revisions.

read point-by-point responses
  1. Referee: [Abstract and §3] Abstract and §3 (vertex-count theorem): the claim that the feasible set of lower-triangular H achieving the exact minimax residual bound over all nonexpansive operators is a polytope with precisely (N-1)! vertices is load-bearing. The manuscript must derive the explicit (linear) constraints imposed by the infinite-dimensional sup and prove that the optimum is attained only at isolated vertices numbering exactly (N-1)!, rather than on positive-dimensional faces or under additional nonlinear constraints.

    Authors: In §3 we characterize the feasible set of lower-triangular matrices H as those satisfying the exact minimax residual bound for every nonexpansive operator. The infinite-dimensional supremum reduces to a finite collection of linear inequalities because the worst-case behavior is completely determined by the finite set of extremal operator realizations encoded by the arc diagrams; each diagram supplies a distinct linear constraint on the entries of H. The proof that the resulting polytope has exactly (N-1)! vertices proceeds by exhibiting a bijection: each of the (N-1)! arc diagrams corresponds to a unique point at which a maximal set of these inequalities is simultaneously tight, and any convex combination of two or more diagrams produces a matrix that fails to attain the bound for at least one operator. Consequently, no positive-dimensional face can achieve the minimax value. We will add an explicit subsection in the revision that lists the linear inequalities derived from the operator conditions and spells out the isolation argument for the vertices. revision: yes

  2. Referee: [§2.1] §2.1 (lower-triangular parametrization): the restriction to lower-triangular H is presented as the representation for fixed-step algorithms. The paper should verify whether every optimal algorithm (or every algorithm achieving the minimax bound) admits such a representation, or whether the restriction excludes candidate algorithms that could achieve the same bound outside this class.

    Authors: The lower-triangular parametrization is without loss of generality for the class of fixed-step algorithms. Any fixed-point iteration with a fixed number of steps can be rewritten in lower-triangular form by re-expressing the updates in terms of the cumulative residuals; this equivalence is standard and is recalled in §2.1. Moreover, the optimality condition itself forces any matrix achieving the exact minimax bound to be lower-triangular: an upper-triangular entry would allow a nonexpansive operator that violates the bound while preserving the fixed-step structure. Thus the restriction does not exclude any optimal algorithms. We will insert a short clarifying paragraph in §2.1 that makes this equivalence and the necessity argument explicit. revision: yes

Circularity Check

0 steps flagged

No circularity: combinatorial enumeration of optimal vertices is independent of inputs

full rationale

The paper derives the exact count of (N-1)! extremal vertices for optimal (N-1)-step algorithms from the lower-triangular matrix parametrization H and the minimax residual objective over nonexpansive operators. This is presented as a direct mathematical result via combinatorial structure and arc diagrams, without any self-definitional equivalence, fitted parameters renamed as predictions, or load-bearing self-citations. The derivation chain relies on explicit matrix operations and graph encodings that do not reduce to the claim by construction, making the result self-contained against external verification.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 1 invented entities

Only the abstract is available, so the ledger is necessarily incomplete and limited to assumptions stated or implied by the abstract.

axioms (1)
  • domain assumption Fixed-point operators are nonexpansive (Lipschitz constant at most 1)
    Standard assumption invoked throughout the abstract for convergence guarantees.
invented entities (1)
  • Arc diagram no independent evidence
    purpose: Graph that encodes the convergence proof of an extremal optimal algorithm
    New representational device introduced to visualize and manipulate optimality proofs.

pith-pipeline@v0.9.0 · 5495 in / 1253 out tokens · 23739 ms · 2026-05-08T18:44:14.375797+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

32 extracted references · 32 canonical work pages

  1. [1]

    Altschuler and Pablo A

    Jason M. Altschuler and Pablo A. Parrilo. Acceleration by stepsize hedging: Multi-step descent and the silver stepsize schedule.Journal of The ACM, 72(2), 2025

  2. [2]

    Altschuler and Pablo A

    Jason M. Altschuler and Pablo A. Parrilo. Acceleration by stepsize hedging: Silver Stepsize Schedule for smooth convex optimization.Mathematical Programming, 213(1):1105–1118, 2025

  3. [3]

    Springer, second edition, 2017

    Heinz H Bauschke and Patrick L Combettes.Convex Analysis and Monotone Operator Theory in Hilbert Spaces. Springer, second edition, 2017

  4. [4]

    Firmly nonexpansive mappings and maximally monotone operators: Correspondence and duality.Set-Valued and Variational Analysis, 20:131–153, 2012

    Heinz H Bauschke, Sarah M Moffat, and Xianfu Wang. Firmly nonexpansive mappings and maximally monotone operators: Correspondence and duality.Set-Valued and Variational Analysis, 20:131–153, 2012

  5. [5]

    Fast optimistic gradient descent ascent (OGDA) method in continuous and discrete time.Foundations of Computational Mathematics, 2023

    Radu Ioan Bot ¸, Ern¨ o Robert Csetnek, and Dang-Khoa Nguyen. Fast optimistic gradient descent ascent (OGDA) method in continuous and discrete time.Foundations of Computational Mathematics, 2023

  6. [6]

    Stochastic Halpern iteration with variance reduction for stochastic monotone inclusions.Neural Information Processing Systems, 2022

    Xufeng Cai, Chaobing Song, Crist´ obal Guzm´ an, and Jelena Diakonikolas. Stochastic Halpern iteration with variance reduction for stochastic monotone inclusions.Neural Information Processing Systems, 2022

  7. [7]

    A combinatorial proof of the all minors matrix tree theorem.SIAM Journal on Algebraic Discrete Methods, 3(3):319–329, 1982

    Seth Chaiken. A combinatorial proof of the all minors matrix tree theorem.SIAM Journal on Algebraic Discrete Methods, 3(3):319–329, 1982

  8. [8]

    Halpern iteration for near-optimal and parameter-free monotone inclusion and strong solutions to variational inequalities.Conference on Learning Theory, 2020

    Jelena Diakonikolas. Halpern iteration for near-optimal and parameter-free monotone inclusion and strong solutions to variational inequalities.Conference on Learning Theory, 2020

  9. [9]

    Provably faster gradient descent via long steps.SIAM Journal on Optimization, 34(3):2588–2608, 2024

    Benjamin Grimmer. Provably faster gradient descent via long steps.SIAM Journal on Optimization, 34(3):2588–2608, 2024

  10. [10]

    Benjamin Grimmer, Kevin Shu, and Alex L. Wang. Accelerated objective gap and gradient norm convergence for gradient descent via long steps.INFORMS Journal on Optimization, 7(2):156–169, 2025

  11. [11]

    Benjamin Grimmer, Kevin Shu, and Alex L. Wang. Composing optimized stepsize schedules for gradient descent. Mathematics of Operations Research, 0(0), 2025

  12. [12]

    Fixed points of nonexpanding maps.Bulletin of the American Mathematical Society, 73(6): 957–961, 1967

    Benjamin Halpern. Fixed points of nonexpanding maps.Bulletin of the American Mathematical Society, 73(6): 957–961, 1967

  13. [13]

    Accelerated proximal point method for maximally monotone operators.Mathematical Programming, 190(1–2):57–87, 2021

    Donghwan Kim. Accelerated proximal point method for maximally monotone operators.Mathematical Programming, 190(1–2):57–87, 2021

  14. [14]

    Donghwan Kim and Jeffrey A. Fessler. Optimizing the efficiency of first-order methods for decreasing the gradient of smooth convex functions.Journal of Optimization Theory and Applications, 188(1):192–219, 2021

  15. [15]

    Ozdaglar, Chanwoo Park, and Ernest K

    Jaeyeon Kim, Asuman E. Ozdaglar, Chanwoo Park, and Ernest K. Ryu. Time-reversed dissipation induces duality between minimizing gradient norm and function value.Neural Information Processing Systems, 2023

  16. [16]

    Jaeyeon Kim, Chanwoo Park, Asuman Ozdaglar, Jelena Diakonikolas, and Ernest K. Ryu. Mirror duality in convex optimization.arXiv:2311.17296, 2023

  17. [17]

    A proof of the exact convergence rate of gradient descent.arXiv preprint arXiv:2412.04427, 2024

    Jungbin Kim. A proof of the exact convergence rate of gradient descent.arXiv preprint arXiv:2412.04427, 2024

  18. [18]

    M. A. Krasnosel’skii. Two remarks on the method of successive approximations.Uspekhi Matematicheskikh Nauk, 10 (1):123–127, 1955

  19. [19]

    Fast extra gradient methods for smooth structured nonconvex-nonconcave minimax problems.Neural Information Processing Systems, 2021

    Sucheol Lee and Donghwan Kim. Fast extra gradient methods for smooth structured nonconvex-nonconcave minimax problems.Neural Information Processing Systems, 2021

  20. [20]

    On the convergence rate of the Halpern-iteration.Optimization Letters, 15(2):405–418, 2021

    Felix Lieder. On the convergence rate of the Halpern-iteration.Optimization Letters, 15(2):405–418, 2021

  21. [21]

    Mean value methods in iteration.Proceedings of the American Mathematical Society, 4(3): 506–510, 1953

    William Robert Mann. Mean value methods in iteration.Proceedings of the American Mathematical Society, 4(3): 506–510, 1953. 28

  22. [22]

    Aryan Mokhtari, Asuman Ozdaglar, and Sarath Pattathil. A unified analysis of extra-gradient and optimistic gradient methods for saddle point problems: Proximal point approach.International Conference on Artificial Intelligence and Statistics, 2020

  23. [23]

    Jisun Park and Ernest K. Ryu. Exact optimal accelerated complexity for fixed-point iterations.International Conference on Machine Learning, 2022

  24. [24]

    Kevin Shu and Alex L. Wang. A Unified Theory of H-Duality in First-Order Methods.Forthcoming, 2026

  25. [25]

    Suh, Jisun Park, and Ernest K

    Jaewook J. Suh, Jisun Park, and Ernest K. Ryu. Continuous-time analysis of anchor acceleration.Neural Information Processing Systems, 2023

  26. [26]

    From Halpern’s fixed-point iterations to Nesterov’s accelerated interpretations for root-finding problems.Computational Optimization and Applications, 87(1):181–218, 2024

    Quoc Tran-Dinh. From Halpern’s fixed-point iterations to Nesterov’s accelerated interpretations for root-finding problems.Computational Optimization and Applications, 87(1):181–218, 2024

  27. [27]

    Tran-Dinh and Y

    Quoc Tran-Dinh and Yang Luo. Halpern-type accelerated and splitting algorithms for monotone inclusions. arXiv:2110.08150, 2021

  28. [28]

    TaeHo Yoon and Ernest K. Ryu. Accelerated algorithms for smooth convex-concave minimax problems withO(1/k 2) rate on squared gradient norm.International Conference on Machine Learning, 2021

  29. [29]

    TaeHo Yoon and Ernest K. Ryu. Accelerated minimax algorithms flock together.SIAM Journal on Optimization, 35(1):180–209, 2025

  30. [30]

    TaeHo Yoon, Jaeyeon Kim, Jaewook J Suh, and Ernest K. Ryu. Optimal acceleration for minimax and fixed-point problems is not unique.International Conference on Machine Learning, 2024

  31. [31]

    Ryu, and Benjamin Grimmer

    TaeHo Yoon, Ernest K. Ryu, and Benjamin Grimmer. H-invariance theory: A complete characterization of minimax optimal fixed-point algorithms.arXiv:2511.14915, 2025

  32. [32]

    Q(1, N−2) Q(2, N−2) # ∈span (

    Zehao Zhang and Rujun Jiang. Accelerated gradient descent by concatenation of stepsize schedules.arXiv:2410.12395 (To Appear in SIAM Journal on Optimization), 2024. A Proof of Lemma 2 We first state and prove some useful combinatorial lemmas. Lemma 10(Chu–Vandermonde identity).For anyα, β∈Rand any integern≥0, nX m=0 α m β n−m = α+β n . Proof.Compare the c...