SIAM Journal on Control and Optimization
5 Pith papers cite this work.
citation roles: background (1)
citation polarities: background (1)
citing papers explorer
- Efficient Gradient Methods for Distributed Saddle Problems
  A novel decoupled method for distributed saddle problems achieves optimal communication complexity via multi-stage residual norm minimization, with a matching lower bound and extension to variational inequalities.
- Training Deep Learning Models with Norm-Constrained LMOs
  Scion is a new stochastic LMO-based optimizer family that unifies existing methods, supports unconstrained problems, and delivers hyperparameter transferability plus speedups on nanoGPT training.
- On the Stationary Duality of Structural Composite Cardinality Optimization
  Stationary duality reduces composite cardinality optimization to simple cardinality, yielding dual problems with equivalent local solutions and global solutions under appropriate parameter selection.
- Stability and Generalization for Decentralized Markov SGD
  Decentralized SGD and SGDA under Markovian sampling admit non-asymptotic generalization bounds that incorporate network topology, Markov mixing rates, and primal-dual dynamics.