FedXDS uses propagation-based attribution to identify task-relevant features for selective data sharing in federated learning, yielding higher accuracy and faster convergence under heterogeneity with formal privacy guarantees.
Federated Learning with Non-IID Data
Canonical reference. 75% of citing Pith papers cite this work as background.
abstract
Federated learning enables resource-constrained edge compute devices, such as mobile phones and IoT devices, to learn a shared model for prediction, while keeping the training data local. This decentralized approach to train models provides privacy, security, regulatory and economic benefits. In this work, we focus on the statistical challenge of federated learning when local data is non-IID. We first show that the accuracy of federated learning reduces significantly, by up to 55% for neural networks trained for highly skewed non-IID data, where each client device trains only on a single class of data. We further show that this accuracy reduction can be explained by the weight divergence, which can be quantified by the earth mover's distance (EMD) between the distribution over classes on each device and the population distribution. As a solution, we propose a strategy to improve training on non-IID data by creating a small subset of data which is globally shared between all the edge devices. Experiments show that accuracy can be increased by 30% for the CIFAR-10 dataset with only 5% globally shared data.
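The distance between a device's class distribution and the population distribution can be illustrated with a small sketch. It assumes the distance is computed as the sum of absolute differences between per-class probabilities, which is an assumption about how EMD is instantiated for label distributions here. The function names and the toy data are hypothetical, and the shared-subset size is illustrative rather than the 5% setting reported in the abstract.

```python
from collections import Counter


def class_distribution(labels, num_classes):
    """Return the empirical probability of each class in `labels`."""
    if not labels:
        raise ValueError("labels must be non-empty")
    counts = Counter(labels)
    total = len(labels)
    return [counts.get(c, 0) / total for c in range(num_classes)]


def label_emd(client_labels, population_labels, num_classes):
    """Sum of absolute per-class probability differences (assumed EMD form)."""
    p_client = class_distribution(client_labels, num_classes)
    p_pop = class_distribution(population_labels, num_classes)
    return sum(abs(a - b) for a, b in zip(p_client, p_pop))


# Hypothetical toy data: a population with 10 balanced classes.
NUM_CLASSES = 10
population = [c for c in range(NUM_CLASSES) for _ in range(100)]

# A highly skewed client that holds a single class, and a balanced client.
single_class_client = [3] * 100
balanced_client = [c for c in range(NUM_CLASSES) for _ in range(10)]

# A small balanced subset shared with every client (size chosen for illustration).
shared_subset = [c for c in range(NUM_CLASSES) for _ in range(5)]

emd_skewed = label_emd(single_class_client, population, NUM_CLASSES)
emd_balanced = label_emd(balanced_client, population, NUM_CLASSES)
emd_with_shared = label_emd(single_class_client + shared_subset, population, NUM_CLASSES)

print(f"single-class client: {emd_skewed:.3f}")
print(f"balanced client:     {emd_balanced:.3f}")
print(f"with shared subset:  {emd_with_shared:.3f}")
```

In this toy setting, adding the shared subset moves the single-class client's label distribution toward the population, so the distance shrinks. That is the intuition behind the data-sharing strategy, although the sketch says nothing about the resulting model accuracy.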
verdicts
UNVERDICTED 43
representative citing papers
DIPBox is the first multi-scale testing framework for detecting adversarial dataset regeneration via four similarity metrics, backed by learning-theoretic analysis of utility-divergence trade-offs.
LOSCAR-SGD combines local updates, sparse model averaging, and communication-computation overlap with a delay-corrected merge rule, providing convergence rates for smooth non-convex objectives under worker heterogeneity.
Ringmaster LMO extends delay-thresholding from ASGD to LMO-based momentum updates, providing convergence guarantees under (L0, L1)-smoothness and time-complexity bounds that recover optimal rates in the Euclidean case.
Unified convergence rates and tight lower bounds for Byzantine-robust distributed SGD under stochasticity and general data heterogeneity, showing local momentum reduces stochastic error floors.
FedSAP stabilizes federated prototype learning via a deterministic alignment curriculum and proxy separation loss, reporting up to 4 percentage point gains under high heterogeneity across three benchmarks.
A device-partitioning bandwidth allocation policy for federated learning over IIoT networks that provably reduces total training time compared to any non-partitioning scheme.
A single adversary in distributed training inflates its attribution value via latent optimization on synthetic batches without degrading accuracy or triggering basic defenses.
FedVSSAM mitigates flatness incompatibility in SAM-based federated learning by consistently using a variance-suppressed adjusted direction for local perturbation, descent, and global updates, with non-convex convergence guarantees.
ForgeVLA enables federated VLA model training from unlabeled vision-action pairs by recovering language via embodied classifiers and using contrastive planning plus adaptive aggregation to avoid feature collapse.
AW-PSP dynamically weights node sampling by real-time availability predictions and failure correlations to improve robustness, label coverage, and fairness in federated learning under correlated device failures.
HierFedCEA delivers a hierarchical federated learning framework for privacy-preserving climate control optimization across heterogeneous CEA facilities, reaching 94% of centralized performance with under 1 MB communication.
Conditioning a global FL model on local PCA statistics of client data matches oracle cluster performance across heterogeneous settings and is robust to sparse data with zero added communication.
Hybrid QFL cuts quantum transmissions from 3TNMP to {3t + 2(T-t)}NMP over T rounds while preserving near-centralized convergence and improving depolarizing-noise resilience via decentralized aggregation and Steane-code QEC.
Fed-Listing infers client label proportions in FedGNNs from final-layer gradients, outperforming baselines on four datasets and three architectures even in non-i.i.d. settings.
Fed-TaLoRA uses task-agnostic low-rank residual adaptation with post-aggregation calibration to enable efficient federated continual fine-tuning across sequential tasks under non-IID conditions.
SP-CACW is a convergence-aware client weighting scheme for selfish personalized federated learning that minimizes an upper bound on the target client's convergence error and can zero out harmful peers.
GEN-Guard detects generalization failures in federated surgical AI via client-blocked evaluation and corrects them via disagreement-aware distillation, yielding reported gains on in-federation, unseen-institution, and worst-case performance.
pFedCKKS derives CKKS parameter constraints for PFL under 128-bit security reducing choices to inner and outer ciphertext primes and evaluates precision-cost trade-offs on FEMNIST, CelebA and Sentiment140.
EvoCSFL combines candidate generation, a multi-objective metric, surrogate approximation, and evolutionary search to optimize client subsets in federated learning, reporting faster convergence and lower energy on image classification tasks.
FedMPT applies causal modeling and LLM-driven condition prompts with optimal transport and gating to perform federated multi-label prompt tuning of VLMs, claiming competitive results on benchmarks.
FIRMA introduces Fibonacci ring aggregation protocols for server-free federated learning that maintain private heads and achieve higher accuracy than FedAvg under label skew across multiple benchmarks and heterogeneity regimes.
Proactive client selection in federated learning via differentially private mutual information and simulated annealing to optimize Potential Federation Loss for utility and fairness.
FedSDR augments federated self-distillation with dual LoRA streams (local smoothing and global rectification) to produce globally aligned, factually faithful models under statistical heterogeneity.
citing papers explorer
- DIPBox: A Multi-scale Testing Framework for Tracking Dataset Regeneration
DIPBox is the first multi-scale testing framework for detecting adversarial dataset regeneration via four similarity metrics, backed by learning-theoretic analysis of utility-divergence trade-offs.
- Exploring CKKS Parameter Trade-offs for Privacy-Preserving Personalized Federated Learning
pFedCKKS derives CKKS parameter constraints for PFL under 128-bit security reducing choices to inner and outer ciphertext primes and evaluates precision-cost trade-offs on FEMNIST, CelebA and Sentiment140.
- FedSurrogate: Backdoor Defense in Federated Learning via Layer Criticality and Surrogate Replacement
FedSurrogate defends federated learning against backdoors by clustering on security-critical layers and substituting malicious updates with benign surrogates, reporting false-positive rates below 10% and attack success below 2.1% under non-IID conditions.
- AI-Native Closed-Loop Security for 6G-Enabled Cyber-Physical Systems: From Edge Detection to Network-Wide Mitigation
A survey of 128 studies under PRISMA 2020 that maps 6G CPS threats, unifies edge anomaly detection across datasets and models, synthesizes a closed-loop SDN/NFV/O-RAN architecture, and identifies open problems in data, latency, trust, standardization, and evaluation.
- TITAN-FedAnil+: Trust-Based Adaptive Blockchain Federated Learning for Resource-Constrained Intelligent Enterprises
TITAN-FedAnil+ proposes trust-based adaptive clustered aggregation with blockchain resynchronization to improve robustness and reduce memory use in federated learning on constrained devices.
- XAI-SOH-FL: Enhancing SOH-FL with Adaptive Aggregation and Explainable AI for Intrusion Detection in Heterogeneous IoT
XAI-SOH-FL extends SOH-FL with adaptive gamma via Bayesian optimization and SHAP interpretability, reporting 94.12% accuracy and 0.92 F1 on CICIDS2017 while converging faster than baseline.
- From Data Heterogeneity to Convergence: A Data-Centric Review of Federated Learning
A data-centric survey of federated learning that ranks non-IID data traits by influence on convergence, links splitting protocols to real phenomena, and examines data-related defenses under clean and adversarial conditions.