Recognition: 2 theorem links
A Canonical Structure for Constructing Projected First-Order Algorithms With Delayed Feedback
Pith reviewed 2026-05-13 19:22 UTC · model grok-4.3
The pith
A linear transformation turns first-order algorithms with delayed feedback into a form that projects onto constraints while preserving original convergence rates.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The paper establishes that a broad class of unconstrained first-order algorithms admitting a Lur'e representation, including those with delayed gradient feedback, can be rewritten via a simple linear transformation into a canonical structure. This structure enables the construction of projected counterparts for set-constrained problems by performing the projection in the Lyapunov-induced norm. The resulting projected algorithms converge to the optimal solution while retaining exactly the convergence rates of their unconstrained versions.
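Read concretely: if the unconstrained algorithm has a quadratic Lyapunov function V(z) = z⊤Pz with P ≻ 0, the "projection in the Lyapunov-induced norm" is presumably the P-weighted projection onto the constraint set C (notation ours, inferred from the abstract):

```latex
\Pi_{\mathcal{C}}^{P}(y) \;=\; \operatorname*{arg\,min}_{z \in \mathcal{C}} \, \|z - y\|_{P}^{2},
\qquad \|v\|_{P}^{2} \;:=\; v^{\top} P v, \quad P \succ 0 .
```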
What carries the argument
A canonical Lur'e structure, obtained by a linear change of coordinates, in which the projection step can be performed in the associated Lyapunov-induced norm.
If this is right
- Projected versions of the algorithms reach the optimal solution for convex set-constrained problems.
- Convergence rates remain identical to those of the original unconstrained algorithms.
- The construction applies directly to methods with relative degree greater than one, such as those using delayed gradients.
- A systematic extension from unconstrained to constrained optimization is available for the entire class.
Where Pith is reading between the lines
- Control engineers could apply the same linear transformation step to embed constraints into real-time feedback optimization loops.
- The approach invites direct numerical checks on standard test functions with added delays to measure whether observed rates match theory.
- Similar canonical forms might later be derived for algorithms that incorporate stochastic gradients or other perturbations.
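The numerical-check suggestion above can be sketched minimally. Everything below is an assumed toy instance, not the paper's experiment: f(x) = x²/2 (so m = L = 1), a one-step delay, and step size α = 0.3. For this instance the theoretical linear rate is the spectral radius of the lifted iteration matrix, which the empirical decay should match.

```python
# Toy rate check (our example, not the paper's setup):
# one-step-delayed gradient descent on f(x) = x^2 / 2,
#   x_{k+1} = x_k - alpha * f'(x_{k-1}) = x_k - alpha * x_{k-1}.
# With lifted state z_k = (x_k, x_{k-1}) this is z_{k+1} = A z_k,
# A = [[1, -alpha], [1, 0]]; the linear rate is the spectral radius of A.
import cmath
import math

alpha = 0.3
# Eigenvalues of A solve lambda^2 - lambda + alpha = 0.
disc = cmath.sqrt(1 - 4 * alpha)
rho_theory = max(abs((1 + disc) / 2), abs((1 - disc) / 2))  # = sqrt(alpha) here

# Run the iteration and estimate the geometric decay rate of ||z_k||.
x_prev, x = 1.0, 1.0
K = 400
for _ in range(K):
    x_prev, x = x, x - alpha * x_prev
rho_emp = math.hypot(x, x_prev) ** (1.0 / K)

print(f"theory {rho_theory:.4f}  observed {rho_emp:.4f}")
```

Adding a projection step and repeating the measurement on the same instance would probe the paper's rate-preservation claim directly.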
Load-bearing premise
Unconstrained first-order algorithms with delayed feedback admit a Lur'e representation that arises from a simple linear transformation, even when relative degree exceeds one.
What would settle it
A concrete counterexample would disprove the claim: a specific delayed-feedback algorithm that, after the linear transformation and projection step, either fails to reach the optimum or converges strictly slower than its unconstrained counterpart.
Original abstract
This work introduces a canonical structure for a broad class of unconstrained first-order algorithms that admit a Lur'e representation, including systems with relative degree greater than one, e.g., systems with delayed gradient feedback. The proposed canonical structure is obtained through a simple linear transformation. It enables a direct extension from unconstrained optimization algorithms to set-constrained ones through projection in a Lyapunov-induced norm. The resulting projected algorithms attain the optimal solution while preserving the convergence rates of their unconstrained counterparts.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces a canonical Lur'e structure for a broad class of unconstrained first-order optimization algorithms that admit a Lur'e representation, including systems with relative degree greater than one such as those with delayed gradient feedback. The structure is obtained via a simple linear transformation. This enables direct construction of projected first-order algorithms for set-constrained problems by performing the projection step in the norm induced by the Lyapunov function of the unconstrained algorithm. The resulting projected methods are claimed to reach the optimal solution while exactly preserving the convergence rates of the original unconstrained algorithms.
Significance. If the linear transformation and the associated projection construction are rigorously verified, the work supplies a unified framework for extending many existing first-order methods (including delayed-feedback variants) to constrained settings without rate loss. This would be useful in distributed optimization and control applications where delays arise naturally, and the reliance on an existing Lyapunov function could simplify analysis compared with ad-hoc projection designs.
Major comments (2)
- [Section 3] The abstract asserts that a simple linear transformation always produces a canonical Lur'e form even for relative degree >1, but the manuscript must supply the explicit transformation matrices and verify that the resulting system remains in Lur'e form with the same Lyapunov function; without this derivation the central extension to projected algorithms cannot be assessed.
- [Section 4] The claim that projection in the Lyapunov-induced norm preserves both optimality and the original convergence rate (stated in the abstract) requires an explicit proof that the projected update still satisfies the Lyapunov decrease inequality; this step is load-bearing for the main result and must appear in the main theorem (likely Theorem 1 or equivalent).
Minor comments (2)
- [Section 2] Clarify the precise definition of the Lur'e representation used for delayed systems and state any assumptions on the delay size or step-size that are required for the transformation to be well-defined.
- [Section 5] Add a concrete low-dimensional example (e.g., gradient descent with one-step delay) showing the transformation matrices and the resulting projected iteration to illustrate the construction.
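The requested low-dimensional example can be sketched as follows. This is a hypothetical illustration of the lifted Lur'e form, not the paper's actual matrices: gradient descent with a one-step delay, x_{k+1} = x_k − α∇f(x_{k−1}), written with lifted state z_k = (x_k, x_{k−1}) as z_{k+1} = A z_k + B u_k, y_k = C z_k, u_k = ∇f(y_k). Relative degree greater than one then shows up as CB = 0 while CAB ≠ 0.

```python
# Lifted Lur'e form of one-step-delayed gradient descent (scalar decision
# variable for simplicity; matrices are our illustrative choice):
#   z_{k+1} = A z_k + B u_k,  y_k = C z_k,  u_k = grad f(y_k).
alpha = 0.1
A = [[1.0, 0.0],
     [1.0, 0.0]]   # (x_{k+1}, x_k) = (x_k - alpha u_k, x_k)
B = [-alpha, 0.0]
C = [0.0, 1.0]     # output y_k = x_{k-1}, the delayed iterate

def matvec(M, v):
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

CB = dot(C, B)              # 0.0: the input does not reach the output in one step
CAB = dot(C, matvec(A, B))  # -alpha: it reaches the output after two steps
```

Since CB = 0 and CAB = −α ≠ 0, the system has relative degree two, the case the manuscript's transformation is meant to handle.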
Simulated Author's Rebuttal
We thank the referee for the constructive comments, which have helped clarify the presentation of the canonical Lur'e structure and its extension to projected algorithms. We address each major comment below and have revised the manuscript to incorporate the requested derivations and proofs.
Point-by-point responses
Referee: [Section 3] The abstract asserts that a simple linear transformation always produces a canonical Lur'e form even for relative degree >1, but the manuscript must supply the explicit transformation matrices and verify that the resulting system remains in Lur'e form with the same Lyapunov function; without this derivation the central extension to projected algorithms cannot be assessed.
Authors: We agree that the explicit transformation matrices are required for rigorous verification. In the revised manuscript we have added the explicit linear transformation matrices in Section 3, together with a direct verification that the transformed system remains in canonical Lur'e form and that the original Lyapunov function is preserved. This derivation now supports the subsequent construction of the projected algorithms. Revision: yes.
Referee: [Section 4] The claim that projection in the Lyapunov-induced norm preserves both optimality and the original convergence rate (stated in the abstract) requires an explicit proof that the projected update still satisfies the Lyapunov decrease inequality; this step is load-bearing for the main result and must appear in the main theorem (likely Theorem 1 or equivalent).
Authors: We thank the referee for highlighting this critical step. The revised manuscript now contains an explicit proof inside the main theorem (Theorem 1) showing that the projected update in the Lyapunov-induced norm continues to satisfy the Lyapunov decrease inequality. Consequently, both optimality of the limit point and preservation of the original convergence rate are rigorously established. Revision: yes.
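The standard lemma such a proof would lean on (our inference from the abstract; the manuscript's actual argument may differ) is that projection onto a convex set C is nonexpansive in the same P-norm used by the Lyapunov function, so any contraction established for the unconstrained update y_{k+1} survives the projection. With optimizer z* ∈ C,

```latex
\|\Pi_{\mathcal{C}}^{P}(y_{k+1}) - z^{\ast}\|_{P}
\;\le\; \|y_{k+1} - z^{\ast}\|_{P}
\;\le\; \rho \, \|z_{k} - z^{\ast}\|_{P},
```

so the contraction factor ρ, and hence the rate, is unchanged by the projection step.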
Circularity Check
No significant circularity in derivation chain
Full rationale
The paper constructs a canonical Lur'e structure via an explicit linear transformation applied to a broad class of first-order algorithms (including delayed-feedback cases with relative degree >1). This transformation is presented as always feasible by assumption on the class, after which projection is performed directly in the induced Lyapunov norm. The preservation of optimality and convergence rate follows from the norm choice and the projection definition rather than from any fitted parameter or self-referential redefinition. No load-bearing self-citation, uniqueness theorem imported from the authors' prior work, or ansatz smuggled via citation is required; the steps remain independent of the target result. The derivation is therefore self-contained against external benchmarks for the stated class.
Axiom & Free-Parameter Ledger
Axioms (1)
- Domain assumption: Unconstrained first-order algorithms with delayed feedback admit a Lur'e representation obtainable via a simple linear transformation, including for relative degree greater than one.
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel — unclear
  Relation between the paper passage and the cited Recognition theorem is unclear.
  Paper passage (Theorem 1): System (5) with relative degree r ≥ 1 can be rewritten as [canonical block form with Ã11, Ã12, B̃1, C̃1].
- IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction — unclear
  Relation between the paper passage and the cited Recognition theorem is unclear.
  Paper passage (Theorem 4): The projected algorithm (22) converges linearly at the same rate ρ as its unconstrained counterpart.
What do these tags mean?
- matches — The paper's claim is directly supported by a theorem in the formal canon.
- supports — The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends — The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses — The paper appears to rely on the theorem as machinery.
- contradicts — The paper's claim conflicts with a theorem or certificate in the canon.
- unclear — Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
- [1] R. M. Gower, M. Schmidt, F. Bach, and P. Richtárik, "Variance-reduced methods for machine learning," Proceedings of the IEEE, vol. 108, no. 11, pp. 1968–1983, 2020.
- [2] H. Li, C. Fang, and Z. Lin, "Accelerated first-order optimization algorithms for machine learning," Proceedings of the IEEE, vol. 108, no. 11, pp. 2067–2082, 2020.
- [3] D. K. Molzahn, F. Dörfler, H. Sandberg, S. H. Low, S. Chakrabarti, R. Baldick, and J. Lavaei, "A survey of distributed optimization and control algorithms for electric power systems," IEEE Transactions on Smart Grid, vol. 8, no. 6, pp. 2941–2962, 2017.
- [4] M. Li, J. Watson, and I. Lestas, "Distributed optimal secondary frequency control in power networks with delay independent stability," IEEE Transactions on Automatic Control, vol. 69, no. 6, pp. 3748–3763, 2023.
- [5] Y. Nesterov, Introductory Lectures on Convex Optimization: A Basic Course. Springer Science & Business Media, 2003, vol. 87.
- [6] S. Boyd and L. Vandenberghe, Convex Optimization. Cambridge University Press, 2004.
- [7] A. d'Aspremont, D. Scieur, and A. Taylor, "Acceleration methods," Foundations and Trends in Optimization, vol. 5, no. 1–2, pp. 1–245, 2021.
- [8] A. Megretski and A. Rantzer, "System analysis via integral quadratic constraints," IEEE Transactions on Automatic Control, vol. 42, no. 6, pp. 819–830, 1997.
- [9] P. Seiler, "Stability analysis with dissipation inequalities and integral quadratic constraints," IEEE Transactions on Automatic Control, vol. 60, no. 6, pp. 1704–1709, 2014.
- [10] L. Lessard, B. Recht, and A. Packard, "Analysis and design of optimization algorithms via integral quadratic constraints," SIAM Journal on Optimization, vol. 26, no. 1, pp. 57–95, 2016.
- [11] J. Zhang, P. Seiler, and J. Carrasco, "Zames–Falb multipliers for convergence rate: motivating example and convex searches," International Journal of Control, vol. 95, no. 3, pp. 821–829, 2022.
- [12] C. Scherer and C. Ebenbauer, "Convex synthesis of accelerated gradient algorithms," SIAM Journal on Control and Optimization, vol. 59, no. 6, pp. 4615–4645, 2021.
- [13] C. W. Scherer, C. Ebenbauer, and T. Holicki, "Optimization algorithm synthesis based on integral quadratic constraints: A tutorial," in 2023 62nd IEEE Conference on Decision and Control (CDC). IEEE, 2023, pp. 2995–3002.
- [14] C. W. Scherer and C. Ebenbauer, "A tutorial on convex design of optimization algorithms by integral quadratic constraints," Annual Review of Control, Robotics, and Autonomous Systems, vol. 9, 2025.
- [15] B. Van Scoy, R. A. Freeman, and K. M. Lynch, "The fastest known globally convergent first-order method for minimizing strongly convex functions," IEEE Control Systems Letters, vol. 2, no. 1, pp. 49–54, 2017.
- [16] R. Boczar, L. Lessard, A. Packard, and B. Recht, "Exponential stability analysis via integral quadratic constraints," arXiv preprint arXiv:1706.01337, 2017.
- [17] B. Hu and P. Seiler, "Exponential decay rate conditions for uncertain linear systems using integral quadratic constraints," IEEE Transactions on Automatic Control, vol. 61, no. 11, pp. 3631–3637, 2016.
- [18] V. V. Kulkarni and M. G. Safonov, "All multipliers for repeated monotone nonlinearities," IEEE Transactions on Automatic Control, vol. 47, no. 7, pp. 1209–1212, 2002.
- [19] J. Carrasco, W. P. Heath, and A. Lanzon, "Equivalence between classes of multipliers for slope-restricted nonlinearities," Automatica, vol. 49, no. 6, pp. 1732–1740, 2013.
- [20] J. Carrasco, M. C. Turner, and W. P. Heath, "Zames–Falb multipliers for absolute stability: From O'Shea's contribution to convex searches," European Journal of Control, vol. 28, pp. 1–19, 2016.
- [21] M. Fetzer and C. W. Scherer, "Absolute stability analysis of discrete time feedback interconnections," IFAC-PapersOnLine, vol. 50, no. 1, pp. 8447–8453, 2017.
- [22] J. Carrasco, W. P. Heath, J. Zhang, N. S. Ahmad, and S. Wang, "Convex searches for discrete-time Zames–Falb multipliers," IEEE Transactions on Automatic Control, vol. 65, no. 11, pp. 4538–4553, 2019.
- [23] B. Van Scoy and L. Lessard, "Absolute stability via lifting and interpolation," in 2022 IEEE 61st Conference on Decision and Control (CDC). IEEE, 2022, pp. 6217–6223.
- [24] L. Su, P. Seiler, J. Carrasco, and S. Z. Khong, "On the necessity and sufficiency of discrete-time O'Shea–Zames–Falb multipliers," Automatica, vol. 150, p. 110872, 2023.
- [25] H. Gyotoku, T. Yuno, Y. Ebihara, V. Magron, D. Peaucelle, and S. Tarbouriech, "On dual of LMIs for absolute stability analysis of nonlinear feedback systems with static O'Shea–Zames–Falb multipliers," arXiv preprint arXiv:2411.14339, 2024.
- [26] M. Li, T. Hatanaka, and M. Nagahara, "On the generalization of the multivariable Popov criterion for slope-restricted nonlinearities," in 2024 63rd IEEE Conference on Decision and Control (CDC), 2024, to appear.
- [27] A. B. Taylor, J. M. Hendrickx, and F. Glineur, "Exact worst-case performance of first-order methods for composite convex optimization," SIAM Journal on Optimization, vol. 27, no. 3, pp. 1283–1313, 2017.
- [28] ——, "Smooth strongly convex interpolation and exact worst-case performance of first-order methods," Mathematical Programming, vol. 161, pp. 307–345, 2017.
- [29] ——, "Performance estimation toolbox (PESTO): Automated worst-case analysis of first-order optimization methods," in 2017 IEEE 56th Annual Conference on Decision and Control (CDC). IEEE, 2017, pp. 1278–1283.
- [30] L. Lessard, "The analysis of optimization algorithms: A dissipativity approach," IEEE Control Systems Magazine, vol. 42, no. 3, pp. 58–72, 2022.
- [31] M. Li, K. Laib, T. Hatanaka, and I. Lestas, "Convergence rate bounds for the mirror descent method: IQCs, Popov criterion and Bregman divergence," Automatica, vol. 171, p. 111973, 2025.
- [32] J. Miller, C. Scherer, F. Jakob, and A. Iannelli, "Structure, analysis, and synthesis of first-order algorithms," arXiv preprint arXiv:2603.24795, 2026.
- [33] M. Li, I. Lestas, and M. Nagahara, "First-order projected algorithms with the same linear convergence rate bounds as their unconstrained counterparts," arXiv preprint arXiv:2503.13965, 2025.
- [34] S. Zheng, Q. Meng, T. Wang, W. Chen, N. Yu, Z.-M. Ma, and T.-Y. Liu, "Asynchronous stochastic gradient descent with delay compensation," in International Conference on Machine Learning. PMLR, 2017, pp. 4120–4129.
- [35] H. J. Choi, W. Choi, and J. Seok, "Non-ergodic linear convergence property of the delayed gradient descent under the strongly convexity and the Polyak–Łojasiewicz condition," Analysis and Applications, vol. 22, no. 6, pp. 1023–1051, 2024.
- [36] T. Hatanaka, N. Chopra, T. Ishizaki, and N. Li, "Passivity-based distributed optimization with communication delays using PI consensus algorithm," IEEE Transactions on Automatic Control, vol. 63, no. 12, pp. 4421–4428, 2018.
- [37] M. Li, S. Yamashita, T. Hatanaka, and G. Chesi, "Smooth dynamics for distributed constrained optimization with heterogeneous delays," IEEE Control Systems Letters, vol. 4, no. 3, pp. 626–631, 2020.
- [38] A. Ruszczynski, Nonlinear Optimization. Princeton University Press, 2011.
- [39] H. H. Bauschke and P. L. Combettes, Convex Analysis and Monotone Operator Theory in Hilbert Spaces. Springer, 2017.
- [40] K. Zhou, J. C. Doyle, and K. Glover, Robust and Optimal Control. New Jersey: Prentice Hall, 1996, vol. 40.
- [41] C. W. Scherer, "CarstenScherer/Algorithm-Synthesis," https://zenodo.org/badge/latestdoi/691960972, Sep. 2023, software, released 2023-09-15.