Efficient Gradient Methods for Distributed Saddle Problems

Anton Rodomanov; Ruichen Luo; Sebastian U. Stich

arxiv: 2605.18604 · v1 · pith:ACEQWCZSnew · submitted 2026-05-18 · 🧮 math.OC · cs.DC· cs.MA

Efficient Gradient Methods for Distributed Saddle Problems

Ruichen Luo , Anton Rodomanov , Sebastian U. Stich This is my paper

Pith reviewed 2026-05-20 08:35 UTC · model grok-4.3

classification 🧮 math.OC cs.DCcs.MA

keywords distributed saddle point problemscommunication complexitygradient methodsvariational inequalitiesdecoupled optimizationzero-respecting algorithmsresidual norm minimizationmulti-agent systems

0 comments

The pith

A decoupled method achieves optimal communication cost for distributed saddle problems.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper formalizes the distributed setting for saddle point problems and defines precise measures of communication and computational cost. It presents a decoupled gradient method built on a multi-stage reduction to independent minimization of residual norms. This construction delivers better communication complexity than prior approaches including Extragradient while preserving standard oracle rates. A matching lower bound shows the method is optimal among all gradient-span algorithms in the zero-respecting framework. The same technique yields improved communication bounds when the setting is extended to variational inequality problems that model multiplayer general-sum games.

Core claim

Through a multi-stage reduction to the decoupled minimization of residual norms, the authors construct a gradient method for distributed saddle problems that achieves optimal communication cost in the zero-respecting framework. This strictly improves upon prior communication costs for the class and the oracle cost of Extragradient. A matching lower bound establishes that the method is communication-optimal among gradient-span algorithms. The approach further delivers a new state-of-the-art communication complexity for distributed variational inequality problems.

What carries the argument

The multi-stage reduction to decoupled minimization of residual norms, which separates the saddle problem into independent subproblems that can be solved with minimal inter-node communication.

If this is right

The method improves communication cost over the Extragradient method and other known algorithms for distributed saddle problems.
It is provably optimal among gradient-span algorithms via the matching lower bound.
The same technique gives a new state-of-the-art communication bound for distributed variational inequality problems.
The formal definitions of communication and computation costs enable direct comparison of future distributed methods.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The reduction technique may transfer to other distributed min-max settings such as federated robust optimization to lower network traffic.
Practical tests on multi-agent reinforcement learning tasks could reveal whether the communication savings translate to wall-clock gains.
If the zero-respecting model accurately captures real distributed hardware constraints, the optimality result would guide algorithm design beyond this paper.
Similar staged decoupling might apply to non-smooth or stochastic variants of saddle problems without extra coordination overhead.

Load-bearing premise

The multi-stage reduction to decoupled minimization of residual norms preserves optimal communication complexity without introducing hidden coordination costs or requiring stronger smoothness assumptions than those stated for the original saddle problem.

What would settle it

A concrete distributed saddle point instance in which the proposed method requires more communication rounds than the stated lower bound for gradient-span algorithms would disprove the optimality claim.

read the original abstract

The distributed setting for Saddle Problems (SPs) has recently emerged as a framework for various modern applications in machine learning and multiagent systems. Despite its relevance, the theoretical foundations of this setting have not yet been thoroughly established. In this paper, we advance this research direction by formalizing the distributed setup for SPs and providing rigorous definitions of communication and computational costs. Our main result is a novel decoupled method that achieves optimal communication cost within the zero-respecting framework. Our method is based on a multi-stage reduction to the decoupled minimization of residual norms, which yields strict improvements over the best known communication cost for the class and the long-standing oracle cost of the Extragradient method. Further, we show by a matching lower bound that our method is communication-optimal within the family of gradient-span algorithms. Finally, we study the extension of distributed SP into Variational Inequality Problem (VIP), which generalizes two-player zero-sum games to multiplayer general-sum games. We show that our decoupled method achieves a new state-of-the-art communication complexity for this broader class.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper gives a decoupled multi-stage method for distributed saddle problems that improves communication complexity and matches a lower bound for gradient-span algorithms.

read the letter

The main thing to know is that this paper introduces a decoupled method for distributed saddle-point problems which reaches optimal communication cost within the gradient-span class, backed by a matching lower bound. They formalize the distributed setup with explicit definitions for communication and computation costs, then reduce the saddle problem across stages to decoupled residual-norm minimizations. This yields strict gains over extragradient and earlier distributed min-max solvers, and the same approach extends to variational inequalities for multiplayer general-sum games with new state-of-the-art bounds. The lower bound construction is the part that makes the optimality claim concrete rather than just an upper-bound improvement. The work stays inside standard gradient-based methods but tightens the communication analysis for the distributed case. The zero-respecting framework is used consistently, which is a standard restriction here and not a hidden flaw. One minor point worth checking in the full proofs is whether the multi-stage reduction adds any extra coordination steps that the complexity count might understate, though the abstract presents the costs as cleanly preserved. Overall the derivations look internally consistent with no obvious circularity. This is useful reading for people working on communication-efficient algorithms in distributed machine learning or multi-agent systems. Readers who care about tight bounds for min-max and variational inequality problems will get concrete new numbers to compare against. The paper shows clear engagement with the literature and has enough technical substance to go through peer review rather than a desk reject.

Referee Report

2 major / 3 minor

Summary. The manuscript formalizes the distributed setting for saddle-point problems (SPs), provides rigorous definitions of communication and computational costs, and introduces a novel decoupled gradient method based on a multi-stage reduction to the minimization of residual norms. The central claims are that this method achieves optimal communication cost within the zero-respecting framework, yields strict improvements over the Extragradient method and prior distributed SP bounds, is communication-optimal among gradient-span algorithms via a matching lower bound, and extends to variational inequality problems (VIPs) with new state-of-the-art communication complexity.

Significance. If the derivations and lower-bound construction hold, the work meaningfully advances the theoretical foundations of distributed saddle-point optimization, a setting relevant to machine learning and multi-agent systems. The combination of a constructive optimal method, a matching lower bound within the gradient-span class, and the VIP extension constitutes a substantive contribution; the explicit use of the zero-respecting framework and the multi-stage residual-norm reduction are particularly noteworthy strengths.

major comments (2)

[Technical development of the decoupled method] The multi-stage reduction to decoupled residual-norm minimization (described in the abstract and presumably detailed in the main technical sections) is load-bearing for the optimality claim; the manuscript should explicitly verify that each stage preserves the stated communication complexity without introducing hidden coordination rounds or requiring stronger smoothness assumptions than those given for the original SP.
[Lower-bound argument] The matching lower bound for communication optimality within gradient-span algorithms rests on an external construction; the reduction steps that map the SP instance to the lower-bound setting must be shown to incur no additional communication cost beyond the claimed bound.

minor comments (3)

[Abstract] The abstract states 'strict improvements over the best known communication cost for the class'; a brief quantitative comparison or explicit citation to the prior bound would improve clarity.
[Preliminaries] Notation for residual norms and the zero-respecting property should be introduced with a short self-contained definition in the preliminaries to aid readers unfamiliar with the framework.
[Numerical or complexity tables] Figure captions and table headers that summarize communication complexities should include the precise dependence on problem parameters (e.g., smoothness constants, dimension) for immediate readability.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the thorough review and the recommendation for minor revision. The comments highlight important aspects of our technical development that we are happy to clarify and strengthen in the revised manuscript.

read point-by-point responses

Referee: The multi-stage reduction to decoupled residual-norm minimization (described in the abstract and presumably detailed in the main technical sections) is load-bearing for the optimality claim; the manuscript should explicitly verify that each stage preserves the stated communication complexity without introducing hidden coordination rounds or requiring stronger smoothness assumptions than those given for the original SP.

Authors: We appreciate this observation. In the revised version, we have added a dedicated subsection in Section 3.2 that explicitly verifies each stage of the multi-stage reduction. Specifically, we show that each stage involves only local gradient computations and a single round of communication for norm aggregation, without any additional coordination. Furthermore, the smoothness assumptions are unchanged from the original saddle-point problem, as the residual norm minimization is performed under the same Lipschitz conditions. This clarification ensures the communication complexity bound holds as stated. revision: yes
Referee: The matching lower bound for communication optimality within gradient-span algorithms rests on an external construction; the reduction steps that map the SP instance to the lower-bound setting must be shown to incur no additional communication cost beyond the claimed bound.

Authors: We agree that the reduction steps require explicit treatment. We have included a new appendix (Appendix C) that details the mapping from the distributed saddle-point instance to the lower-bound construction. This mapping is achieved through a direct embedding of the problem parameters, which requires only the initial distribution of data and incurs no extra communication rounds beyond those already accounted for in our complexity analysis. We demonstrate that the zero-respecting property is preserved throughout. revision: yes

Circularity Check

0 steps flagged

No significant circularity detected in derivation chain

full rationale

The paper formalizes the distributed saddle-point setting and defines communication/computational costs explicitly before presenting a novel decoupled method constructed via multi-stage reduction to residual-norm minimization. The claimed optimality within zero-respecting and gradient-span classes is supported by an explicit matching lower bound constructed separately in the work, rather than by re-deriving the method's own outputs or fitted quantities. No load-bearing step reduces by definition or self-citation to the inputs; the VIP extension follows directly from the same construction without circular dependencies. The derivation remains self-contained against the stated assumptions and external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claim rests on standard assumptions from convex optimization and distributed computing literature plus the novel reduction technique; no new entities are postulated.

axioms (2)

domain assumption The saddle point problem satisfies standard Lipschitz and monotonicity conditions that allow the residual norm minimization to be decoupled across stages.
Invoked to justify the multi-stage reduction without additional communication.
domain assumption Communication cost is measured strictly in the zero-respecting framework where only gradient information is exchanged.
Defines the model under which optimality is claimed.

pith-pipeline@v0.9.0 · 5716 in / 1316 out tokens · 32983 ms · 2026-05-20T08:35:28.186236+00:00 · methodology

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/RealityFromDistinction.lean reality_from_one_distinction unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

Theorem 1... T_DM-SP ≤ 2 + 4 L_xy D_x D_y / ε

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

53 extracted references · 53 canonical work pages

[1]

Mathematical Programming , volume=

Lower bounds for finding stationary points I , author=. Mathematical Programming , volume=. 2020 , publisher=

work page 2020
[2]

Decoupled

Ali Zindari and Parham Yazdkhasti and Anton Rodomanov and Tatjana Chavdarova and Sebastian U Stich , booktitle=. Decoupled. 2025 , url=

work page 2025
[3]

1983 , publisher=

Problem complexity and method efficiency in optimization , author=. 1983 , publisher=

work page 1983
[4]

arXiv preprint arXiv:2109.00534 , year=

The minimax complexity of distributed optimization , author=. arXiv preprint arXiv:2109.00534 , year=

work page arXiv
[5]

The Twelfth International Conference on Learning Representations , year=

Communication-Efficient Gradient Descent-Ascent Methods for Distributed Variational Inequalities: Unified Analysis and Local Updates , author=. The Twelfth International Conference on Learning Representations , year=

work page
[6]

Mathematical Programming , volume=

Minimizing finite sums with the stochastic average gradient , author=. Mathematical Programming , volume=. 2017 , publisher=

work page 2017
[7]

International Conference on Artificial Intelligence and Statistics , pages=

Local stochastic gradient descent ascent: Convergence analysis and communication efficiency , author=. International Conference on Artificial Intelligence and Statistics , pages=. 2021 , organization=

work page 2021
[8]

Optimization Methods and Software , pages=

Distributed saddle point problems: lower bounds, near-optimal and robust algorithms , author=. Optimization Methods and Software , pages=. 2025 , publisher=

work page 2025
[9]

Advances in Neural Information Processing Systems , volume=

A catalyst framework for minimax optimization , author=. Advances in Neural Information Processing Systems , volume=

work page
[10]

Mathematical Programming , pages=

A novel catalyst scheme for stochastic minimax optimization , author=. Mathematical Programming , pages=. 2026 , publisher=

work page 2026
[11]

Journal of Computational and Applied Mathematics , volume=

On linear convergence of iterative methods for the variational inequality problem , author=. Journal of Computational and Applied Mathematics , volume=. 1995 , publisher=

work page 1995
[12]

Conference on Learning Theory , pages=

Sharper rates for separable minimax and finite sum optimization via primal-dual extragradient methods , author=. Conference on Learning Theory , pages=. 2022 , organization=

work page 2022
[13]

Advances in neural information processing systems , volume=

Generative adversarial nets , author=. Advances in neural information processing systems , volume=

work page
[14]

Mathematical programming , volume=

Robust optimization--methodology and applications , author=. Mathematical programming , volume=. 2002 , publisher=

work page 2002
[15]

1947 , publisher=

Theory of games and economic behavior , author=. 1947 , publisher=

work page 1947
[16]

Journal of Complexity , volume=

Information-based complexity of linear operator equations , author=. Journal of Complexity , volume=. 1992 , publisher=

work page 1992
[17]

Optimization for Machine Learning , volume=

First order methods for nonsmooth convex large-scale optimization, ii: utilizing problems structure , author=. Optimization for Machine Learning , volume=. 2011 , publisher=

work page 2011
[18]

, author=

Multiagent reinforcement learning: theoretical framework and an algorithm. , author=. ICML , volume=

work page
[19]

Econometrica: Journal of the Econometric Society , pages=

Existence and uniqueness of equilibrium points for concave n-person games , author=. Econometrica: Journal of the Econometric Society , pages=. 1965 , publisher=

work page 1965
[20]

Communication complexity of

Babichenko, Yakov and Rubinstein, Aviad , booktitle=. Communication complexity of. 2020 , organization=

work page 2020
[21]

Communication complexity of approximate

Babichenko, Yakov and Rubinstein, Aviad , booktitle=. Communication complexity of approximate

work page
[22]

Journal of Economic Theory , volume=

The communication requirements of efficient allocations and supporting prices , author=. Journal of Economic Theory , volume=. 2006 , publisher=

work page 2006
[23]

Proceedings of the twenty-first international conference on Machine learning , pages=

Communication complexity as a lower bound for learning in games , author=. Proceedings of the twenty-first international conference on Machine learning , pages=

work page
[24]

Games and Economic Behavior , volume=

How long to equilibrium? The communication complexity of uncoupled equilibrium procedures , author=. Games and Economic Behavior , volume=. 2010 , publisher=

work page 2010
[25]

Artificial intelligence and statistics , pages=

Communication-efficient learning of deep networks from decentralized data , author=. Artificial intelligence and statistics , pages=. 2017 , organization=

work page 2017
[26]

Accelerated algorithms for smooth convex-concave minimax problems with O (1/k\^

Yoon, TaeHo and Ryu, Ernest K , booktitle=. Accelerated algorithms for smooth convex-concave minimax problems with O (1/k\^. 2021 , organization=

work page 2021
[27]

The Thirty-ninth Annual Conference on Neural Information Processing Systems , year=

Multiplayer Federated Learning: Reaching Equilibrium with Less Communication , author=. The Thirty-ninth Annual Conference on Neural Information Processing Systems , year=

work page
[28]

OPT 2025: Optimization for Machine Learning , year=

PEARL-Prox: Proximal Algorithm for Resolving Player Drift in Multiplayer Federated Learning , author=. OPT 2025: Optimization for Machine Learning , year=

work page 2025
[29]

Mathematical Programming , volume=

Accelerated proximal point method for maximally monotone operators , author=. Mathematical Programming , volume=. 2021 , publisher=

work page 2021
[30]

arXiv preprint arXiv:2410.14369 , year=

Extra-Gradient method with flexible anchoring: Strong convergence and fast residual decay , author=. arXiv preprint arXiv:2410.14369 , year=

work page arXiv
[31]

Foundations of Computational Mathematics , volume=

Fast optimistic gradient descent ascent (OGDA) method in continuous and discrete time , author=. Foundations of Computational Mathematics , volume=. 2025 , publisher=

work page 2025
[32]

Conference on learning theory , pages=

Near-optimal algorithms for minimax optimization , author=. Conference on learning theory , pages=. 2020 , organization=

work page 2020
[33]

Journal of mathematical imaging and vision , volume=

A first-order primal-dual algorithm for convex problems with applications to imaging , author=. Journal of mathematical imaging and vision , volume=. 2011 , publisher=

work page 2011
[34]

SIAM Journal on Optimization , volume=

Optimal primal-dual methods for a class of saddle point problems , author=. SIAM Journal on Optimization , volume=. 2014 , publisher=

work page 2014
[35]

Advances in Neural Information Processing Systems , volume=

Optimal extragradient-based algorithms for stochastic variational inequalities with separable structure , author=. Advances in Neural Information Processing Systems , volume=

work page
[36]

International conference on artificial intelligence and statistics , pages=

Lifted primal-dual method for bilinearly coupled smooth minimax optimization , author=. International conference on artificial intelligence and statistics , pages=. 2022 , organization=

work page 2022
[37]

SIAM Journal on Optimization , volume=

An accelerated hybrid proximal extragradient method for convex optimization and its implications to second-order methods , author=. SIAM Journal on Optimization , volume=. 2013 , publisher=

work page 2013
[38]

Advances in Neural Information Processing Systems , volume=

Stabilized proximal-point methods for federated optimization , author=. Advances in Neural Information Processing Systems , volume=

work page
[39]

Mathematical Programming , volume=

Accelerated schemes for a class of variational inequalities , author=. Mathematical Programming , volume=. 2017 , publisher=

work page 2017
[40]

arXiv preprint arXiv:2311.15154 , year=

High-order reduced-gradient methods for composite variational inequalities , author=. arXiv preprint arXiv:2311.15154 , year=

work page arXiv
[41]

Convergence analysis of a proximal-like minimization algorithm using

Chen, Gong and Teboulle, Marc , journal=. Convergence analysis of a proximal-like minimization algorithm using. 1993 , publisher=

work page 1993
[42]

Matecon , volume=

The extragradient method for finding saddle points and other problems , author=. Matecon , volume=

work page
[43]

Optimal and parameter-free gra- dient minimization methods for convex and nonconvex optimization.arXiv preprint arXiv:2310.12139, 2023

Optimal and parameter-free gradient minimization methods for convex and nonconvex optimization , author=. arXiv preprint arXiv:2310.12139 , year=

work page arXiv
[44]

, author=

A hybrid projection-proximal point algorithm. , author=. Journal of convex analysis , volume=. 1999 , publisher=

work page 1999
[45]

SIAM journal on control and optimization , volume=

Monotone operators and the proximal point algorithm , author=. SIAM journal on control and optimization , volume=. 1976 , publisher=

work page 1976
[46]

Dokl akad nauk Sssr , volume=

A method for solving the convex programming problem with convergence rate O (1/k2) , author=. Dokl akad nauk Sssr , volume=

work page
[47]

Prox-method with rate of convergence

Nemirovski, Arkadi , journal=. Prox-method with rate of convergence. 2004 , publisher=

work page 2004
[48]

Advances in Neural Information Processing Systems , volume=

Improved algorithms for convex-concave minimax optimization , author=. Advances in Neural Information Processing Systems , volume=

work page
[49]

SIAM Journal on Optimization , volume=

Contracting proximal methods for smooth convex optimization , author=. SIAM Journal on Optimization , volume=. 2020 , publisher=

work page 2020
[50]

Nesterov , title =

Yurii E. Nesterov , title =. 2004 , url =. doi:10.1007/978-1-4419-8853-9 , isbn =

work page doi:10.1007/978-1-4419-8853-9 2004
[51]

Mathematical Programming , volume=

On lower iteration complexity bounds for the convex concave saddle point problems , author=. Mathematical Programming , volume=. 2022 , publisher=

work page 2022
[52]

Computational Optimization and Applications , volume=

An accelerated minimax algorithm for convex-concave saddle point problems with nonsmooth coupling function , author=. Computational Optimization and Applications , volume=. 2023 , publisher=

work page 2023
[53]

2011 , publisher=

Optimization for machine learning , author=. 2011 , publisher=

work page 2011

[1] [1]

Mathematical Programming , volume=

Lower bounds for finding stationary points I , author=. Mathematical Programming , volume=. 2020 , publisher=

work page 2020

[2] [2]

Decoupled

Ali Zindari and Parham Yazdkhasti and Anton Rodomanov and Tatjana Chavdarova and Sebastian U Stich , booktitle=. Decoupled. 2025 , url=

work page 2025

[3] [3]

1983 , publisher=

Problem complexity and method efficiency in optimization , author=. 1983 , publisher=

work page 1983

[4] [4]

arXiv preprint arXiv:2109.00534 , year=

The minimax complexity of distributed optimization , author=. arXiv preprint arXiv:2109.00534 , year=

work page arXiv

[5] [5]

The Twelfth International Conference on Learning Representations , year=

Communication-Efficient Gradient Descent-Ascent Methods for Distributed Variational Inequalities: Unified Analysis and Local Updates , author=. The Twelfth International Conference on Learning Representations , year=

work page

[6] [6]

Mathematical Programming , volume=

Minimizing finite sums with the stochastic average gradient , author=. Mathematical Programming , volume=. 2017 , publisher=

work page 2017

[7] [7]

International Conference on Artificial Intelligence and Statistics , pages=

Local stochastic gradient descent ascent: Convergence analysis and communication efficiency , author=. International Conference on Artificial Intelligence and Statistics , pages=. 2021 , organization=

work page 2021

[8] [8]

Optimization Methods and Software , pages=

Distributed saddle point problems: lower bounds, near-optimal and robust algorithms , author=. Optimization Methods and Software , pages=. 2025 , publisher=

work page 2025

[9] [9]

Advances in Neural Information Processing Systems , volume=

A catalyst framework for minimax optimization , author=. Advances in Neural Information Processing Systems , volume=

work page

[10] [10]

Mathematical Programming , pages=

A novel catalyst scheme for stochastic minimax optimization , author=. Mathematical Programming , pages=. 2026 , publisher=

work page 2026

[11] [11]

Journal of Computational and Applied Mathematics , volume=

On linear convergence of iterative methods for the variational inequality problem , author=. Journal of Computational and Applied Mathematics , volume=. 1995 , publisher=

work page 1995

[12] [12]

Conference on Learning Theory , pages=

Sharper rates for separable minimax and finite sum optimization via primal-dual extragradient methods , author=. Conference on Learning Theory , pages=. 2022 , organization=

work page 2022

[13] [13]

Advances in neural information processing systems , volume=

Generative adversarial nets , author=. Advances in neural information processing systems , volume=

work page

[14] [14]

Mathematical programming , volume=

Robust optimization--methodology and applications , author=. Mathematical programming , volume=. 2002 , publisher=

work page 2002

[15] [15]

1947 , publisher=

Theory of games and economic behavior , author=. 1947 , publisher=

work page 1947

[16] [16]

Journal of Complexity , volume=

Information-based complexity of linear operator equations , author=. Journal of Complexity , volume=. 1992 , publisher=

work page 1992

[17] [17]

Optimization for Machine Learning , volume=

First order methods for nonsmooth convex large-scale optimization, ii: utilizing problems structure , author=. Optimization for Machine Learning , volume=. 2011 , publisher=

work page 2011

[18] [18]

, author=

Multiagent reinforcement learning: theoretical framework and an algorithm. , author=. ICML , volume=

work page

[19] [19]

Econometrica: Journal of the Econometric Society , pages=

Existence and uniqueness of equilibrium points for concave n-person games , author=. Econometrica: Journal of the Econometric Society , pages=. 1965 , publisher=

work page 1965

[20] [20]

Communication complexity of

Babichenko, Yakov and Rubinstein, Aviad , booktitle=. Communication complexity of. 2020 , organization=

work page 2020

[21] [21]

Communication complexity of approximate

Babichenko, Yakov and Rubinstein, Aviad , booktitle=. Communication complexity of approximate

work page

[22] [22]

Journal of Economic Theory , volume=

The communication requirements of efficient allocations and supporting prices , author=. Journal of Economic Theory , volume=. 2006 , publisher=

work page 2006

[23] [23]

Proceedings of the twenty-first international conference on Machine learning , pages=

Communication complexity as a lower bound for learning in games , author=. Proceedings of the twenty-first international conference on Machine learning , pages=

work page

[24] [24]

Games and Economic Behavior , volume=

How long to equilibrium? The communication complexity of uncoupled equilibrium procedures , author=. Games and Economic Behavior , volume=. 2010 , publisher=

work page 2010

[25] [25]

Artificial intelligence and statistics , pages=

Communication-efficient learning of deep networks from decentralized data , author=. Artificial intelligence and statistics , pages=. 2017 , organization=

work page 2017

[26] [26]

Accelerated algorithms for smooth convex-concave minimax problems with O (1/k\^

Yoon, TaeHo and Ryu, Ernest K , booktitle=. Accelerated algorithms for smooth convex-concave minimax problems with O (1/k\^. 2021 , organization=

work page 2021

[27] [27]

The Thirty-ninth Annual Conference on Neural Information Processing Systems , year=

Multiplayer Federated Learning: Reaching Equilibrium with Less Communication , author=. The Thirty-ninth Annual Conference on Neural Information Processing Systems , year=

work page

[28] [28]

OPT 2025: Optimization for Machine Learning , year=

PEARL-Prox: Proximal Algorithm for Resolving Player Drift in Multiplayer Federated Learning , author=. OPT 2025: Optimization for Machine Learning , year=

work page 2025

[29] [29]

Mathematical Programming , volume=

Accelerated proximal point method for maximally monotone operators , author=. Mathematical Programming , volume=. 2021 , publisher=

work page 2021

[30] [30]

arXiv preprint arXiv:2410.14369 , year=

Extra-Gradient method with flexible anchoring: Strong convergence and fast residual decay , author=. arXiv preprint arXiv:2410.14369 , year=

work page arXiv

[31] [31]

Foundations of Computational Mathematics , volume=

Fast optimistic gradient descent ascent (OGDA) method in continuous and discrete time , author=. Foundations of Computational Mathematics , volume=. 2025 , publisher=

work page 2025

[32] [32]

Conference on learning theory , pages=

Near-optimal algorithms for minimax optimization , author=. Conference on learning theory , pages=. 2020 , organization=

work page 2020

[33] [33]

Journal of mathematical imaging and vision , volume=

A first-order primal-dual algorithm for convex problems with applications to imaging , author=. Journal of mathematical imaging and vision , volume=. 2011 , publisher=

work page 2011

[34] [34]

SIAM Journal on Optimization , volume=

Optimal primal-dual methods for a class of saddle point problems , author=. SIAM Journal on Optimization , volume=. 2014 , publisher=

work page 2014

[35] [35]

Advances in Neural Information Processing Systems , volume=

Optimal extragradient-based algorithms for stochastic variational inequalities with separable structure , author=. Advances in Neural Information Processing Systems , volume=

work page

[36] [36]

International conference on artificial intelligence and statistics , pages=

Lifted primal-dual method for bilinearly coupled smooth minimax optimization , author=. International conference on artificial intelligence and statistics , pages=. 2022 , organization=

work page 2022

[37] [37]

SIAM Journal on Optimization , volume=

An accelerated hybrid proximal extragradient method for convex optimization and its implications to second-order methods , author=. SIAM Journal on Optimization , volume=. 2013 , publisher=

work page 2013

[38] [38]

Advances in Neural Information Processing Systems , volume=

Stabilized proximal-point methods for federated optimization , author=. Advances in Neural Information Processing Systems , volume=

work page

[39] [39]

Mathematical Programming , volume=

Accelerated schemes for a class of variational inequalities , author=. Mathematical Programming , volume=. 2017 , publisher=

work page 2017

[40] [40]

arXiv preprint arXiv:2311.15154 , year=

High-order reduced-gradient methods for composite variational inequalities , author=. arXiv preprint arXiv:2311.15154 , year=

work page arXiv

[41] [41]

Convergence analysis of a proximal-like minimization algorithm using

Chen, Gong and Teboulle, Marc , journal=. Convergence analysis of a proximal-like minimization algorithm using. 1993 , publisher=

work page 1993

[42] [42]

Matecon , volume=

The extragradient method for finding saddle points and other problems , author=. Matecon , volume=

work page

[43] [43]

Optimal and parameter-free gra- dient minimization methods for convex and nonconvex optimization.arXiv preprint arXiv:2310.12139, 2023

Optimal and parameter-free gradient minimization methods for convex and nonconvex optimization , author=. arXiv preprint arXiv:2310.12139 , year=

work page arXiv

[44] [44]

, author=

A hybrid projection-proximal point algorithm. , author=. Journal of convex analysis , volume=. 1999 , publisher=

work page 1999

[45] [45]

SIAM journal on control and optimization , volume=

Monotone operators and the proximal point algorithm , author=. SIAM journal on control and optimization , volume=. 1976 , publisher=

work page 1976

[46] [46]

Dokl akad nauk Sssr , volume=

A method for solving the convex programming problem with convergence rate O (1/k2) , author=. Dokl akad nauk Sssr , volume=

work page

[47] [47]

Prox-method with rate of convergence

Nemirovski, Arkadi , journal=. Prox-method with rate of convergence. 2004 , publisher=

work page 2004

[48] [48]

Advances in Neural Information Processing Systems , volume=

Improved algorithms for convex-concave minimax optimization , author=. Advances in Neural Information Processing Systems , volume=

work page

[49] [49]

SIAM Journal on Optimization , volume=

Contracting proximal methods for smooth convex optimization , author=. SIAM Journal on Optimization , volume=. 2020 , publisher=

work page 2020

[50] [50]

Nesterov , title =

Yurii E. Nesterov , title =. 2004 , url =. doi:10.1007/978-1-4419-8853-9 , isbn =

work page doi:10.1007/978-1-4419-8853-9 2004

[51] [51]

Mathematical Programming , volume=

On lower iteration complexity bounds for the convex concave saddle point problems , author=. Mathematical Programming , volume=. 2022 , publisher=

work page 2022

[52] [52]

Computational Optimization and Applications , volume=

An accelerated minimax algorithm for convex-concave saddle point problems with nonsmooth coupling function , author=. Computational Optimization and Applications , volume=. 2023 , publisher=

work page 2023

[53] [53]

2011 , publisher=

Optimization for machine learning , author=. 2011 , publisher=

work page 2011