1909.02712
3 Pith papers cite this work.
citing papers explorer
- Unveiling High-Probability Generalization in Decentralized SGD
  High-probability generalization bounds for D-SGD are derived at the optimal rate O(1/sqrt(mn) log(1/δ)) via pointwise uniform stability across convex and non-convex settings.
- Clipped Stochastic Gradient Tracking For Locally Smooth Functions
  The authors derive a clipped gradient tracking method with staggered variance reduction for RUC-regular finite-sum distributed optimization problems, establishing an O(∑ n_i^{1.5} + n_i^{0.5} ε^{-1}) complexity bound that relies only on local smoothness.
- Stability and Generalization for Decentralized Markov SGD
  Decentralized SGD and SGDA under Markovian sampling admit non-asymptotic generalization bounds that incorporate network topology, Markov mixing rates, and primal-dual dynamics.