Adaptive Federated Optimization
21 Pith papers cite this work.
abstract
Federated learning is a distributed machine learning paradigm in which a large number of clients coordinate with a central server to learn a model without sharing their own training data. Standard federated optimization methods such as Federated Averaging (FedAvg) are often difficult to tune and exhibit unfavorable convergence behavior. In non-federated settings, adaptive optimization methods have had notable success in combating such issues. In this work, we propose federated versions of adaptive optimizers, including Adagrad, Adam, and Yogi, and analyze their convergence in the presence of heterogeneous data for general non-convex settings. Our results highlight the interplay between client heterogeneity and communication efficiency. We also perform extensive experiments on these methods and show that the use of adaptive optimizers can significantly improve the performance of federated learning.
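As a rough illustration of what a federated adaptive optimizer can look like, the sketch below applies an Adagrad-, Adam-, or Yogi-style step on the server, treating the averaged client model change as a pseudo-gradient. The function name `server_update`, its parameters, and the default hyperparameters are hypothetical choices made for this example, not values taken from the abstract.

```python
import math


def _sign(a):
    """Return -1, 0, or 1 according to the sign of a."""
    return (a > 0) - (a < 0)


def server_update(x, m, v, client_deltas, opt="adam",
                  lr=0.1, beta1=0.9, beta2=0.99, tau=1e-3):
    """One server round of a federated adaptive optimizer (illustrative sketch).

    x, m, v        -- server model, first moment, second moment (lists of floats)
    client_deltas  -- one list per client: local model minus server model
    opt            -- "adagrad", "adam", or "yogi"
    All names and defaults here are assumptions made for the example.
    """
    n = len(client_deltas)
    # Average the client changes; this plays the role of a gradient estimate.
    delta = [sum(d[j] for d in client_deltas) / n for j in range(len(x))]

    new_x, new_m, new_v = [], [], []
    for xj, mj, vj, dj in zip(x, m, v, delta):
        mj = beta1 * mj + (1 - beta1) * dj          # momentum on the pseudo-gradient
        sq = dj * dj
        if opt == "adagrad":
            vj = vj + sq                            # running sum of squares
        elif opt == "adam":
            vj = beta2 * vj + (1 - beta2) * sq      # exponential moving average
        elif opt == "yogi":
            vj = vj - (1 - beta2) * sq * _sign(vj - sq)  # additive, sign-controlled
        else:
            raise ValueError(f"unknown optimizer: {opt}")
        # tau bounds the per-coordinate step size when vj is small.
        new_x.append(xj + lr * mj / (math.sqrt(vj) + tau))
        new_m.append(mj)
        new_v.append(vj)
    return new_x, new_m, new_v


if __name__ == "__main__":
    x, m, v = [0.0, 0.0], [0.0, 0.0], [0.0, 0.0]
    deltas = [[1.0, -1.0], [3.0, -1.0]]  # two hypothetical clients
    print(server_update(x, m, v, deltas, opt="yogi"))
```

The clients themselves would still run plain local SGD in this sketch; only the server step changes, which is why the three variants differ in a single line.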
citing papers explorer
- LOSCAR-SGD: Local SGD with Communication-Computation Overlap and Delay-Corrected Sparse Model Averaging
  LOSCAR-SGD combines local updates, sparse model averaging, and communication-computation overlap with a delay-corrected merge rule, providing convergence rates for smooth non-convex objectives under worker heterogeneity.
- Ringmaster LMO: Asynchronous Linear Minimization Oracle Momentum Method
  Ringmaster LMO extends delay-thresholding from ASGD to LMO-based momentum updates, providing convergence guarantees under (L0, L1)-smoothness and time-complexity bounds that recover optimal rates in the Euclidean case.
- Scalable Distributed Stochastic Optimization via Bidirectional Compression: Beyond Pessimistic Limits
  Inkheart SGD and M4 use bidirectional compression to achieve time complexities in distributed SGD that improve with worker count n and surpass prior lower bounds under a necessary structural assumption.
- FedBCGD: Communication-Efficient Accelerated Block Coordinate Gradient Descent for Federated Learning
  FedBCGD reduces communication in federated learning by a factor of 1/N through block-wise parameter updates with accelerated convergence guarantees.
- Mixture of Predefined Experts: Maximizing Data Usage on Vertical Federated Learning
  Split-MoPE integrates split learning with predefined-expert routing to maximize usable data in vertical federated learning under sample misalignment, delivering state-of-the-art accuracy in one communication round plus built-in robustness and per-sample contribution scores.
- Federated Learning for Surgical Vision in Appendicitis Classification: Results of the FedSurg EndoVis 2024 Challenge
  The FedSurg challenge benchmarks federated learning on appendectomy videos and finds only 26% F1 on unseen centers even with centralized data, plus extra penalties from decentralization, with spatiotemporal models performing best.
- Synchronous and Asynchronous Parallelism Approaches for Generalized Canonical Polyadic Tensor Decomposition with GenTen
  Presents new synchronous and asynchronous parallel approaches for GCP tensor decomposition and evaluates computational cost and accuracy on synthetic and real-world datasets.
- Statistical Limits and Efficient Algorithms for Differentially Private Federated Learning
  Introduces FedHybrid and FedNewton for DP federated M-estimation, with finite-sample MSE bounds, minimax lower bound, and evaluations on vision datasets.
- FedSDR: Federated Self-Distillation with Rectification
  FedSDR augments federated self-distillation with dual LoRA streams (local smoothing and global rectification) to produce globally aligned, factually faithful models under statistical heterogeneity.
- Rennala MVR: Improved Time Complexity for Parallel Stochastic Optimization via Momentum-Based Variance Reduction
  Rennala MVR improves time complexity over Rennala SGD for smooth nonconvex stochastic optimization in heterogeneous parallel systems under a mean-squared smoothness assumption.
- Self-Play Enhancement via Advantage-Weighted Refinement in Online Federated LLM Fine-Tuning with Real-Time Feedback
  SPEAR enables online federated LLM fine-tuning by using feedback-guided self-play to create contrastive pairs trained with maximum likelihood on correct completions and confidence-weighted unlikelihood on incorrect ones, outperforming baselines without ground-truth contexts.
- FedFrozen: Two-Stage Federated Optimization via Attention Kernel Freezing
  FedFrozen improves stability in heterogeneous federated Transformer training by warming up the full model then freezing the attention kernel (query/key) while optimizing the value block under a fixed kernel.
- FMCL: Class-Aware Client Clustering with Foundation Model Representations for Heterogeneous Federated Learning
  FMCL performs one-shot class-aware client clustering in heterogeneous federated learning by deriving semantic signatures from foundation model embeddings and using cosine distance, yielding improved performance and stable clusters compared to prior methods.
- PubSwap: Public-Data Off-Policy Coordination for Federated RLVR
  PubSwap uses a small public dataset for selective off-policy response swapping in federated RLVR to improve coordination and performance over standard baselines on math and medical reasoning tasks.
- Robust Federated Learning under Adversarial Attacks via Loss-Based Client Clustering
  Loss-based clustering of clients enables robust federated learning against strong Byzantine attacks with bounded optimality gaps using only the server and one honest client.
- FedShield-LLM: A Secure and Scalable Federated Fine-Tuned Large Language Model
  FedShield-LLM integrates pruning and FHE on LoRA parameters to support secure, scalable federated fine-tuning of LLMs such as Llama-2.
- Rethinking the Personalized Relaxed Initialization in the Federated Learning: Consistency and Generalization
  FedInit uses reverse personalized initialization in FL to reduce client drift effects, showing via excess risk that inconsistency impacts generalization error more than optimization error.
- Accelerating Optimization and Machine Learning through Decentralization
  Decentralized optimization can reach optimal solutions in fewer iterations than centralized methods for machine learning tasks.
- A Comparative Study of Federated Learning Aggregation Strategies under Homogeneous and Heterogeneous Data Distributions
  Federated aggregation strategies show distinct performance trade-offs in accuracy, loss, and efficiency depending on whether client data distributions are homogeneous or heterogeneous.
- Performance and Energy Trade-Off Analysis of Hierarchical Federated Learning for Plant Disease Classification
  Hierarchical federated learning for plant-disease classification shows distinct accuracy-versus-energy trade-offs across EfficientNet-B0, ResNet-50, and MobileNetV3-Large paired with FedAvg, FedProx, and FedAvgM.
- FedQueue: Queue-Aware Federated Learning for Cross-Facility HPC Training