Local Adaptivity in Federated Learning: Convergence and Consistency

Gauri Joshi; Jianyu Wang; Luyang Liu; Zachary Charles; Zachary Garrett; Zheng Xu

arXiv: 2106.02305 · v1 · pith:PPV7JCRNnew · submitted 2021-06-04 · 💻 cs.LG · cs.DC · stat.ML

Jianyu Wang, Zheng Xu, Zachary Garrett, Zachary Charles, Luyang Liu, Gauri Joshi

classification 💻 cs.LG · cs.DC · stat.ML

keywords local · adaptive · methods · updates · convergence · federated · learning · optimization

The federated learning (FL) framework trains a machine learning model using decentralized data stored at edge client devices by periodically aggregating locally trained models. Popular optimization algorithms of FL use vanilla (stochastic) gradient descent for both local updates at clients and global updates at the aggregating server. Recently, adaptive optimization methods such as AdaGrad have been studied for server updates. However, the effect of using adaptive optimization methods for local updates at clients is not yet understood. We show in both theory and practice that while local adaptive methods can accelerate convergence, they can cause a non-vanishing solution bias, where the final converged solution may be different from the stationary point of the global objective function. We propose correction techniques to overcome this inconsistency and complement the local adaptive methods for FL. Extensive experiments on realistic federated training tasks show that the proposed algorithms can achieve faster convergence and higher test accuracy than the baselines without local adaptivity.
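
The solution bias described in the abstract can be illustrated with a toy sketch. The setup below is an assumption for illustration, not the paper's algorithm or experiments: two clients hold one-dimensional quadratic objectives, each takes one local gradient step per round, and the hypothetical `preconditioners` list stands in for client-specific adaptive scaling (frozen here, whereas a real adaptive optimizer such as AdaGrad would update it from gradient history). With identical scaling on every client the averaged model reaches the global minimizer; with different scaling per client it settles at a different point.

```python
def run_rounds(curvatures, centers, preconditioners, lr=0.1, rounds=2000, x0=0.0):
    """Toy federated loop: one local step per client, then server averaging.

    Client i minimizes 0.5 * a_i * (x - c_i)^2. Its gradient is divided by a
    fixed scalar d_i, a hypothetical stand-in for local adaptive scaling.
    """
    x = x0
    for _ in range(rounds):
        client_models = []
        for a, c, d in zip(curvatures, centers, preconditioners):
            grad = a * (x - c)                       # local gradient at the shared model
            client_models.append(x - lr * grad / d)  # locally scaled update
        x = sum(client_models) / len(client_models)  # server averages client models
    return x


# Assumed toy problem: equal curvature, different minimizers per client.
curvatures = [1.0, 1.0]
centers = [0.0, 4.0]

# Minimizer of the global objective (sum of the two client objectives).
global_opt = sum(a * c for a, c in zip(curvatures, centers)) / sum(curvatures)

# Same scaling on both clients: equivalent to plain gradient descent.
x_uniform = run_rounds(curvatures, centers, preconditioners=[1.0, 1.0])

# Different scaling per client: the averaged update vanishes at another point.
x_scaled = run_rounds(curvatures, centers, preconditioners=[1.0, 4.0])

print(f"global optimum:        {global_opt:.4f}")
print(f"uniform scaling:       {x_uniform:.4f}")
print(f"heterogeneous scaling: {x_scaled:.4f}")
```

Under these assumptions the heterogeneous run converges to the minimizer of a reweighted objective, in which each client is weighted by its curvature divided by its scaling factor, so the offset from the global optimum does not shrink with more rounds.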

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Rethinking the Personalized Relaxed Initialization in the Federated Learning: Consistency and Generalization
cs.LG · 2026-04 · unverdicted · novelty 4.0

FedInit uses reverse personalized initialization in FL to reduce client drift effects, showing via excess risk that inconsistency impacts generalization error more than optimization error.