A Field Guide to Federated Optimization
Federated learning and analytics are a distributed approach for collaboratively learning models (or statistics) from decentralized data, motivated by and designed for privacy protection. The distributed learning process can be formulated as solving federated optimization problems, which emphasize communication efficiency, data heterogeneity, compatibility with privacy and system requirements, and other constraints that are not primary considerations in other problem settings. This paper provides recommendations and guidelines on formulating, designing, evaluating and analyzing federated optimization algorithms through concrete examples and practical implementation, with a focus on conducting effective simulations to infer real-world performance. The goal of this work is not to survey the current literature, but to inspire researchers and practitioners to design federated learning algorithms that can be used in various practical applications.
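As a rough illustration of the kind of simulation the abstract refers to, the sketch below runs a toy federated averaging loop in plain Python. It is not taken from the paper: the quadratic client losses, the function names `local_update` and `federated_averaging`, and all hyperparameters are assumptions chosen only to keep the example small and deterministic. Each simulated client takes a few local gradient steps from the current server model, and the server then forms a weighted average of the client models.

```python
from typing import Sequence


def local_update(x: float, center: float, lr: float, steps: int) -> float:
    """Run `steps` gradient steps on the toy client loss 0.5 * (x - center)**2."""
    for _ in range(steps):
        grad = x - center  # gradient of the assumed quadratic client loss
        x -= lr * grad
    return x


def federated_averaging(
    centers: Sequence[float],
    weights: Sequence[float],
    rounds: int = 50,
    lr: float = 0.1,
    local_steps: int = 5,
    x0: float = 0.0,
) -> float:
    """Simulate synchronous rounds with full client participation."""
    total = sum(weights)
    x = x0
    for _ in range(rounds):
        # Every client starts from the current server model (hypothetical setup).
        client_models = [local_update(x, c, lr, local_steps) for c in centers]
        # The server aggregates with weights, e.g. proportional to local data size.
        x = sum(w * m for w, m in zip(weights, client_models)) / total
    return x


# Heterogeneous clients: each local optimum differs from the global one.
centers = [1.0, 2.0, 6.0]
weights = [1.0, 1.0, 2.0]
x_final = federated_averaging(centers, weights)
print(x_final)
```

In this toy setting every client loss has the same curvature, so the averaged model converges to the weighted mean of the client optima. With more realistic heterogeneous losses, local steps can pull clients toward different solutions, which is one reason data heterogeneity is a central concern in federated optimization.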
Forward citations
Cited by 14 Pith papers
- LOSCAR-SGD: Local SGD with Communication-Computation Overlap and Delay-Corrected Sparse Model Averaging
  LOSCAR-SGD combines local updates, sparse model averaging, and communication-computation overlap with a delay-corrected merge rule, providing convergence rates for smooth non-convex objectives under worker heterogeneity.
- Ringmaster LMO: Asynchronous Linear Minimization Oracle Momentum Method
  Ringmaster LMO extends delay-thresholding from ASGD to LMO-based momentum updates, providing convergence guarantees under (L0, L1)-smoothness and time-complexity bounds that recover optimal rates in the Euclidean case.
- Scalable Distributed Stochastic Optimization via Bidirectional Compression: Beyond Pessimistic Limits
  Inkheart SGD and M4 use bidirectional compression to achieve time complexities in distributed SGD that improve with worker count n and surpass prior lower bounds under a necessary structural assumption.
- Byzantine-Robust Distributed SGD: A Unified Analysis and Tight Error Bounds
  Unified convergence rates and tight lower bounds for Byzantine-robust distributed SGD under stochasticity and general data heterogeneity, showing local momentum reduces stochastic error floors.
- Automated Byzantine-Resilient Clustered Decentralized Federated Learning for Battery Intelligence in Connected EVs
  ABC-DFL is a new decentralized federated learning framework replacing central servers with blockchain and using FLECA clustering aggregation to achieve Byzantine resilience for EV battery tasks.
- Automated Byzantine-Resilient Clustered Decentralized Federated Learning for Battery Intelligence in Connected EVs
  ABC-DFL replaces central FL servers with a permissioned blockchain and introduces FLECA for filtering malicious updates via adaptive thresholds and oracle-based clustering to achieve Byzantine-resilient decentralized ...
- Response-Conditioned Parallel-to-Sequential Orchestration for Multi-Agent Systems
  Nexa learns a response-conditioned policy that starts with parallel agent execution and adds at most one round of sequential message passing via a predicted sparse DAG, strictly subsuming pure parallel mode.
- DisAgg: Distributed Aggregators for Efficient Secure Aggregation in Federated Learning
  DisAgg distributes secure aggregation to a client committee via secret sharing, eliminating local masking and homomorphic encryption while preserving privacy and delivering 4.6x speedup over OPA for 100k clients and 1...
- Decoupled DiLoCo for Resilient Distributed Pre-training
  Decoupled DiLoCo enables asynchronous distributed pre-training with zero global downtime under simulated failures while preserving competitive performance on text and vision tasks.
- Automated Byzantine-Resilient Clustered Decentralized Federated Learning for Battery Intelligence in Connected EVs
  ABC-DFL replaces central FL servers with a permissioned blockchain and introduces FLECA, a hierarchical filtering-and-clustering aggregation protocol that keeps convergence close to FedProx while limiting attack impac...
- Rennala MVR: Improved Time Complexity for Parallel Stochastic Optimization via Momentum-Based Variance Reduction
  Rennala MVR improves time complexity over Rennala SGD for smooth nonconvex stochastic optimization in heterogeneous parallel systems under a mean-squared smoothness assumption.
- Communication-Efficient Federated Fine-Tuning
  FDA-Opt unifies and improves upon FedOpt and FDA for communication-efficient federated fine-tuning of language models on NLP tasks, outperforming optimized FedOpt baselines.
- Entropy-Regularized Probabilistic Gates for Sparse Model Discovery in Scarce-Data Federated Learning
  Entropy regularization of probabilistic gates improves test performance and sparsity recovery in scarce-data federated learning over Fed-IHT and FedAvg pruning.
- A Survey on Foundation Models for Personalized Federated Intelligence
  The survey introduces personalized federated intelligence (PFI) as a framework integrating federated learning and foundation models to support privacy-aware personalization of AI models.