Advances and Open Problems in Federated Learning

Adrià Gascón; Aleksandra Korolova; Ananda Theertha Suresh; Arjun Nitin Bhagoji; Aurélien Bellet; Ayfer Özgür; Badih Ghazi; Ben Hutchinson; Brendan Avent; Chaoyang He

arxiv: 1912.04977 · v3 · pith:WVFDDUZFnew · submitted 2019-12-10 · 💻 cs.LG · cs.CR · stat.ML

Peter Kairouz, H. Brendan McMahan, Brendan Avent, Aurélien Bellet, Mehdi Bennis, Arjun Nitin Bhagoji, Kallista Bonawitz, Zachary Charles, Graham Cormode, Rachel Cummings, Rafael G.L. D'Oliveira, Hubert Eichner, Salim El Rouayheb, David Evans, Josh Gardner, Zachary Garrett, Adrià Gascón, Badih Ghazi, Phillip B. Gibbons, Marco Gruteser, Zaid Harchaoui, Chaoyang He, Lie He, Zhouyuan Huo, Ben Hutchinson, Justin Hsu, Martin Jaggi, Tara Javidi, Gauri Joshi, Mikhail Khodak, Jakub Konečný, Aleksandra Korolova, Farinaz Koushanfar, Sanmi Koyejo, Tancrède Lepoint, Yang Liu, Prateek Mittal, Mehryar Mohri, Richard Nock, Ayfer Özgür, Rasmus Pagh, Mariana Raykova, Hang Qi, Daniel Ramage, Ramesh Raskar, Dawn Song, Weikang Song, Sebastian U. Stich, Ziteng Sun, Ananda Theertha Suresh, Florian Tramèr, Praneeth Vepakomma, Jianyu Wang, Li Xiong, Zheng Xu, Qiang Yang, Felix X. Yu, Han Yu, Sen Zhao

classification 💻 cs.LG cs.CR stat.ML

keywords learning, data, advances, collection, federated, machine, many, open

Federated learning (FL) is a machine learning setting where many clients (e.g. mobile devices or whole organizations) collaboratively train a model under the orchestration of a central server (e.g. service provider), while keeping the training data decentralized. FL embodies the principles of focused data collection and minimization, and can mitigate many of the systemic privacy risks and costs resulting from traditional, centralized machine learning and data science approaches. Motivated by the explosive growth in FL research, this paper discusses recent advances and presents an extensive collection of open problems and challenges.

Forward citations

Cited by 16 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

TallyTrain: Communication-Efficient Federated Distillation
cs.LG 2026-06 unverdicted novelty 7.0

TallyTrain is a hard-label distillation protocol for federated learning that uses argmax transmission and optional sparse merges to match soft-label performance at up to 1000x lower communication cost.
Quantifying and Defending against the Privacy Risk in Logit-based Federated Learning
cs.CR 2026-06 unverdicted novelty 7.0

Logit-based federated learning leaks private model information to a semi-honest server via shared logits even with unrelated public data, enabling an adaptive stealing attack with theoretical bounds and a logit-pertur...
A Tight Theory of Error Feedback Algorithms in Distributed Optimization
cs.LG 2026-05 unverdicted novelty 7.0

Provides tight convergence analyses for EF and EF21 error feedback algorithms in distributed optimization, recovering single-agent rates independently of agent count.
Federated Learning for Surgical Vision in Appendicitis Classification: Results of the FedSurg EndoVis 2024 Challenge
cs.CV 2025-10 conditional novelty 7.0

The FedSurg challenge benchmarks federated learning on appendectomy videos and finds only 26% F1 on unseen centers even with centralized data, plus extra penalties from decentralization, with spatiotemporal models per...
Expected Gain-based Escalation in Vertical Federated Learning
cs.LG 2026-06 unverdicted novelty 6.0

An analytical expected-gain score from calibrated posteriors and classwise reliability estimates decides escalation in VFL, improving communication-accuracy trade-off over baselines.
Tuning-Free Efficient Estimation for Multi-Source Data via Covariance-Aware Shrinkage
stat.ME 2026-06 unverdicted novelty 6.0

Proposes a covariance-aware tuning-free shrinkage framework and sequential algorithm for multi-source estimation that attains oracle risk asymptotically and improves on single-step methods.
FLARE: Adaptive Multi-Dimensional Reputation for Robust Client Reliability in Federated Learning
cs.LG 2025-11 conditional novelty 6.0

FLARE uses adaptive multi-dimensional reputation scores and soft exclusion to improve Byzantine robustness in federated learning by up to 16% over prior methods while handling a new Statistical Mimicry attack.
Compass: SLO-aware Query Planner for Compound AI Serving at Scale
cs.DB 2025-04 unverdicted novelty 6.0

Compass decomposes multi-query multi-SLO planning for compound AI serving, exploits plan similarities, uses selective profiling, and applies bipartite matching at runtime to deliver 2.4-5.1x higher goodput and 3.8-4.5...
Adaptive Federated Optimization
cs.LG 2020-02 unverdicted novelty 6.0

Proposes federated adaptive optimizers (FedAdagrad, FedAdam, FedYogi) with convergence analysis for non-convex objectives under data heterogeneity and reports empirical gains over FedAvg.
Adaptive Joint Compression and Synchronisation in Federated Split Learning for IoT Rainfall Prediction
cs.LG 2026-06 unverdicted novelty 5.0

A latency-driven scheduler jointly tunes activation compression and synchronization frequency in federated split learning, delivering up to 87% payload reduction and 54% less synchronization traffic on rainfall predic...
Choose Wisely and Privately: Proactive Client Selection for Fair and Efficient Federated Learning
cs.LG 2026-05 unverdicted novelty 5.0

Proposes proactive client selection via differentially private mutual information and Potential Federation Loss optimized by simulated annealing to achieve faster, fairer, and more accurate federated models than unifo...
Choose Wisely and Privately: Proactive Client Selection for Fair and Efficient Federated Learning
cs.LG 2026-05 unverdicted novelty 5.0

Proactive client selection in federated learning via differentially private mutual information and simulated annealing to optimize Potential Federation Loss for utility and fairness.
LADSG: Label-Anonymized Distillation and Similar Gradient Substitution for Label Privacy in Vertical Federated Learning
cs.CR 2025-06 unverdicted novelty 5.0

LADSG is a unified defense framework that reduces success rates of passive, active, and direct label inference attacks in VFL by 30-60% via label anonymization, gradient substitution, and norm-based filtering.
BoBa: Boosting Backdoor Detection through Data Distribution Inference in Federated Learning
cs.LG 2024-07 unverdicted novelty 5.0

BoBa uses data distribution inference and overlapping clustering with voting to detect backdoor attacks in non-IID federated learning, claiming attack success rates below 0.001.
Understanding Communication Backends in Cross-Silo Federated Learning
cs.DC 2026-04 unverdicted novelty 4.0

Benchmarks of MPI, gRPC, and PyTorch RPC in cross-silo FL plus a new gRPC+S3 hybrid backend deliver up to 3.8x speedup for large-model transmission under realistic network conditions.
A Blueprint for AI-Driven Software Quality: Integrating LLMs with Established Standards
cs.SE 2025-05 unverdicted novelty 3.0

Survey mapping LLM applications in software quality assurance to established standards including ISO/IEC 12207, ISO 25010, CMMI, and TMM, with case studies, challenges, and future directions.