Measuring the Effects of Non-Identical Data Distribution for Federated Visual Classification

Hang Qi; Matthew Brown; Tzu-Ming Harry Hsu

arXiv: 1909.06335 · v1 · pith:VZAZBDIC · submitted 2019-09-13 · cs.LG · cs.CV · stat.ML

Reviewed by Pith · 2026-05-17 17:31 UTC · grok-4.3 · pith:VZAZBDIC

classification cs.LG · cs.CV · stat.ML

keywords federated learning · non-IID data · data heterogeneity · federated averaging · server momentum · visual classification · CIFAR-10

The pith

Non-identical data distributions degrade federated averaging performance on visual tasks, but server momentum recovers most of the accuracy loss.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper investigates how differences in data distributions across devices affect federated learning for image classification. The authors create synthetic datasets that vary continuously in how non-identical the label distributions are among clients. They measure that the standard Federated Averaging algorithm loses accuracy as the distributions diverge, with the drop becoming severe in highly skewed cases. Adding momentum updates on the server side substantially improves results, raising accuracy from 30.1 percent to 76.9 percent in the most extreme non-identical setting on CIFAR-10.

Core claim

The central discovery is that performance of the Federated Averaging algorithm degrades as the non-identicalness of data distributions across clients increases, and that this degradation can be mitigated by incorporating server momentum, leading to improved classification accuracy on CIFAR-10 from 30.1% to 76.9% in the most skewed settings.

What carries the argument

A method to synthesize datasets with a continuous range of identicalness, used to quantify the impact on Federated Averaging and to test the server momentum mitigation strategy.
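
To make the server momentum idea concrete, here is a minimal sketch in plain Python. It assumes the server keeps a velocity vector, accumulates the example-weighted average of client deltas into it, and then subtracts the velocity from the global weights. That update rule, the function names, and the default `beta` are illustrative assumptions, not the authors' implementation.

```python
from typing import List, Sequence, Tuple


def weighted_average(vectors: Sequence[Sequence[float]],
                     num_examples: Sequence[int]) -> List[float]:
    """Average vectors, weighting each one by its client's example count."""
    total = sum(num_examples)
    dim = len(vectors[0])
    return [
        sum(n * vec[i] for vec, n in zip(vectors, num_examples)) / total
        for i in range(dim)
    ]


def server_momentum_step(weights: Sequence[float],
                         velocity: Sequence[float],
                         client_weights: Sequence[Sequence[float]],
                         num_examples: Sequence[int],
                         beta: float = 0.9) -> Tuple[List[float], List[float]]:
    """One server round (assumed form): v <- beta*v + avg_delta; w <- w - v."""
    # Each client's delta is the global model minus its locally trained model.
    deltas = [[w - cw for w, cw in zip(weights, cws)] for cws in client_weights]
    avg_delta = weighted_average(deltas, num_examples)
    new_velocity = [beta * v + d for v, d in zip(velocity, avg_delta)]
    new_weights = [w - v for w, v in zip(weights, new_velocity)]
    return new_weights, new_velocity
```

With `beta=0.0` the velocity carries no history, so the step reduces to plain Federated Averaging: the new global weights equal the example-weighted average of the client weights.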

If this is right

Accuracy of federated visual classification declines steadily with increasing differences in client data distributions.
Server momentum provides consistent gains over the full range of non-identicalness tested.
The largest gains occur in the most skewed distribution settings, where baseline accuracy is lowest.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Similar momentum-based corrections might help federated learning on other data modalities or tasks beyond image classification.
Real deployments could benefit from monitoring distribution divergence to decide when to apply such mitigations.
Extending the synthesis method to other forms of heterogeneity, such as feature distribution shifts, would provide a fuller picture.
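
One hypothetical way to monitor distribution divergence, as suggested in the extensions above, is to compare each client's label histogram with the pooled histogram. The sketch below uses total variation distance; the choice of metric and all function names are assumptions for illustration, and nothing here comes from the paper.

```python
from typing import List, Sequence


def normalize(counts: Sequence[float]) -> List[float]:
    """Turn non-negative class counts into a probability vector."""
    total = sum(counts)
    if total == 0:
        raise ValueError("cannot normalize an empty histogram")
    return [c / total for c in counts]


def total_variation(p: Sequence[float], q: Sequence[float]) -> float:
    """Total variation distance: 0 for identical distributions, 1 for disjoint."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def client_divergences(client_counts: Sequence[Sequence[int]]) -> List[float]:
    """Distance between each client's label distribution and the pooled one."""
    num_classes = len(client_counts[0])
    pooled = [sum(c[k] for c in client_counts) for k in range(num_classes)]
    population = normalize(pooled)
    return [total_variation(normalize(c), population) for c in client_counts]
```

A deployment would also need a privacy-preserving way to obtain these histograms, which this sketch does not address.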

Load-bearing premise

The synthetic datasets with controlled label distribution differences accurately represent the non-identical data found on real mobile devices.

What would settle it

Repeating the experiments using actual image data collected from a large number of mobile users and checking whether the accuracy degradation and recovery with momentum match the synthetic results.

Original abstract

Federated Learning enables visual models to be trained in a privacy-preserving way using real-world data from mobile devices. Given their distributed nature, the statistics of the data across these devices are likely to differ significantly. In this work, we look at the effect such non-identical data distributions have on visual classification via Federated Learning. We propose a way to synthesize datasets with a continuous range of identicalness and provide performance measures for the Federated Averaging algorithm. We show that performance degrades as distributions differ more, and propose a mitigation strategy via server momentum. Experiments on CIFAR-10 demonstrate improved classification performance over a range of non-identicalness, with classification accuracy improved from 30.1% to 76.9% in the most skewed settings.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

This paper introduces a continuous synthesis method for non-IID federated vision data on CIFAR-10 and shows server momentum recovers most of the accuracy lost to skew.

Hi, the main thing to know is that this work gives a controllable way to create datasets with varying degrees of non-identicalness and then measures how FedAvg suffers while server momentum helps a lot. They report a jump from 30.1% to 76.9% accuracy in the worst skew case, which is a concrete number worth noting.

What is new is the synthesis procedure that produces a continuous range instead of a handful of fixed partitions. That setup lets them plot a degradation curve across levels of identicalness, which is more useful than the usual discrete comparisons. The server-momentum fix is straightforward and the lift looks real within their experiments. The paper does well at keeping the evaluation focused on standard CIFAR-10 with clear reporting of the main accuracy trends. It stays empirical and avoids overclaiming theory.

The soft spot is the synthesis itself. It appears to center on label distribution skew, which is important but leaves out feature-level shifts, camera artifacts, or per-device quantity differences that show up in actual mobile visual data. If those factors matter more in practice, the reported gains could be narrower than they seem. The stress-test concern lands here: the results are tied to this particular construction, so readers should treat the mitigation as promising but not yet proven for real-device heterogeneity.

Within the paper's own frame the argument is consistent and the numbers are reproducible from the described setup. No load-bearing circularity or post-hoc fitting jumps out.

This is useful for people running federated vision experiments who need a benchmark for non-IID effects or a simple tweak to try. It is not a new framework but it adds measurable evidence that others can reference. I would send it for peer review so referees can examine the synthesis details and ask for more validation against real distributions.

Referee Report

2 major / 2 minor

Summary. The manuscript empirically studies the impact of non-identical data distributions on federated visual classification using Federated Averaging. It introduces a synthesis procedure to generate CIFAR-10 partitions with a controllable, continuous spectrum of statistical heterogeneity, demonstrates performance degradation as identicalness decreases, and proposes server momentum as a mitigation that raises accuracy from 30.1% to 76.9% in the most skewed regime.

Significance. If the synthesis procedure and reported gains hold under scrutiny, the work supplies concrete, quantitative evidence on how label-distribution skew affects federated training and offers a simple, practical mitigation. The continuous control parameter enables systematic measurement rather than binary IID/non-IID comparisons, which is useful for the federated-learning community.

major comments (2)

§3 (Dataset Synthesis): the procedure for modulating the non-identicalness control parameter is described at a high level but does not explicitly state whether it alters only label marginals or also induces feature-level shifts, quantity imbalance, or client-specific imaging artifacts; this distinction is load-bearing for the claim that the observed degradation curve and momentum gain generalize beyond the synthetic construction.
§4 (Experiments): the headline numbers (30.1% to 76.9%) are presented without reported standard deviations across random seeds or client-sampling runs, and without an ablation confirming that the momentum hyper-parameter was not tuned post-hoc on the same skewed partitions used for the final claim.
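
The variability reporting requested in the second comment can be sketched as follows. This is a generic helper built on Python's `statistics` module; the function names are hypothetical, and the numbers in the usage note are made-up placeholders, not results from the paper.

```python
from statistics import mean, stdev
from typing import Sequence, Tuple


def summarize_runs(accuracies: Sequence[float]) -> Tuple[float, float]:
    """Return (mean, sample standard deviation) over independent runs."""
    if len(accuracies) < 2:
        raise ValueError("need at least two runs to estimate a deviation")
    return mean(accuracies), stdev(accuracies)


def format_summary(accuracies: Sequence[float]) -> str:
    """Render accuracies given as fractions in a 'mean +/- std' percent form."""
    m, s = summarize_runs(accuracies)
    return f"{100 * m:.1f} +/- {100 * s:.1f}% (n={len(accuracies)})"
```

For example, `format_summary([0.70, 0.72, 0.74])` on those placeholder values yields `72.0 +/- 2.0% (n=3)`.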

minor comments (2)

Figure 2 (or equivalent accuracy-vs-skew plot): axis labels and legend entries should explicitly name the non-identicalness control parameter values corresponding to each curve.
Related-work section: the discussion of prior federated-learning heterogeneity papers is brief; adding one or two sentences contrasting the continuous synthesis approach with discrete Dirichlet or pathological partitioning methods would improve context.

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for their thoughtful review and positive recommendation for minor revision. Below we respond to each major comment and describe the changes we will incorporate in the revised manuscript.

Referee: §3 (Dataset Synthesis): the procedure for modulating the non-identicalness control parameter is described at a high level but does not explicitly state whether it alters only label marginals or also induces feature-level shifts, quantity imbalance, or client-specific imaging artifacts; this distinction is load-bearing for the claim that the observed degradation curve and momentum gain generalize beyond the synthetic construction.

Authors: We clarify that our synthesis procedure controls the degree of label distribution skew across clients by drawing from a Dirichlet distribution parameterized by alpha, while the underlying images and their features remain unchanged from the original CIFAR-10 dataset. No feature-level shifts, client-specific artifacts, or quantity imbalances are introduced; all clients are assigned the same number of examples. This is a standard approach for studying label skew in federated learning. We have revised the description in Section 3 to explicitly detail these aspects, allowing readers to better assess the generalizability of our findings to other forms of heterogeneity.

Revision: yes

Referee: §4 (Experiments): the headline numbers (30.1% to 76.9%) are presented without reported standard deviations across random seeds or client-sampling runs, and without an ablation confirming that the momentum hyper-parameter was not tuned post-hoc on the same skewed partitions used for the final claim.

Authors: We agree that reporting variability is important. In the revised version, we include standard deviations over multiple random seeds and client sampling runs for the reported accuracies. For the server momentum, the hyper-parameter was chosen based on a grid search performed on a separate set of experiments with moderate skew levels, not tuned specifically on the most skewed partitions for the headline result. We have added an ablation table showing the effect of different momentum values across the spectrum of non-identicalness to demonstrate that the gains are robust and not due to post-hoc selection.

Revision: yes
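
The label-skew construction described in the first response (a Dirichlet draw per client, equal client sizes, images untouched) can be sketched with the standard library alone. This version only produces per-client class counts and leaves out the assignment of actual images. It assumes a uniform class prior, and the function names and the fallback for numerical underflow are illustrative choices rather than the authors' code.

```python
import random
from collections import Counter
from typing import List, Sequence


def sample_dirichlet(concentration: Sequence[float],
                     rng: random.Random) -> List[float]:
    """Draw from a Dirichlet distribution by normalizing Gamma variates."""
    draws = [rng.gammavariate(c, 1.0) for c in concentration]
    total = sum(draws)
    if total == 0.0:
        # Tiny concentrations can underflow to all zeros; fall back to a
        # one-hot vector, which is the limiting behavior (assumption).
        one_hot = [0.0] * len(concentration)
        one_hot[rng.randrange(len(concentration))] = 1.0
        return one_hot
    return [d / total for d in draws]


def label_skew_counts(num_clients: int, examples_per_client: int,
                      num_classes: int, alpha: float,
                      seed: int = 0) -> List[List[int]]:
    """Per-client class counts; smaller alpha gives more skewed clients."""
    rng = random.Random(seed)
    prior = [1.0 / num_classes] * num_classes  # uniform prior (assumption)
    clients = []
    for _ in range(num_clients):
        q = sample_dirichlet([alpha * p for p in prior], rng)
        labels = rng.choices(range(num_classes), weights=q,
                             k=examples_per_client)
        counts = Counter(labels)
        clients.append([counts.get(c, 0) for c in range(num_classes)])
    return clients
```

Every client receives exactly `examples_per_client` labels, matching the equal-quantity statement in the response. A large `alpha` pushes each client toward the uniform prior, while a small `alpha` concentrates each client on very few classes.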

Circularity Check

0 steps flagged

No circularity detected; purely empirical evaluation of FedAvg on synthetic non-IID CIFAR-10 partitions

The manuscript contains no derivation chain, uniqueness theorems, or fitted-parameter predictions. It defines a label-skew synthesis procedure, runs Federated Averaging experiments across a range of skew levels, and reports measured accuracy (30.1% to 76.9%). All reported quantities are direct experimental outputs on held-out test data; none are obtained by algebraic substitution of quantities defined inside the same paper or by self-citation that is itself unverified. The work is therefore self-contained against external benchmarks such as standard CIFAR-10 classification accuracy.

Axiom & Free-Parameter Ledger

1 free parameter · 0 axioms · 0 invented entities

Empirical measurement paper; relies on standard machine-learning assumptions about optimization and generalization but introduces no new theoretical axioms or postulated entities.

free parameters (1)

non-identicalness control parameter
Continuous scalar used to generate datasets with varying degrees of statistical difference across simulated clients.
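
If this control parameter is the Dirichlet concentration alpha named in the simulated rebuttal, its role can be written as below. The uniform prior p over C classes and the symbol q_k for client k's label distribution are notational assumptions made here for illustration.

```latex
% q_k: label distribution of client k; p: class prior (assumed uniform);
% \alpha > 0: the non-identicalness control parameter.
\[
  q_k \sim \operatorname{Dir}(\alpha\, p),
  \qquad p = \left(\tfrac{1}{C}, \dots, \tfrac{1}{C}\right)
\]
% Limiting behavior of the draw:
\[
  \alpha \to \infty:\; q_k \to p \quad \text{(identical clients)},
  \qquad
  \alpha \to 0:\; q_k \to \text{a one-hot vector} \quad \text{(single-class clients)}
\]
```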

Forward citations

Cited by 44 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

From Efficiency to Leakage -- Privacy Backdoor in Federated Language Model Fine-Tuning
cs.CR 2026-06 conditional novelty 7.0

NeuroImprint attack assigns isolated memorization neurons to training samples in PEFT adapters, enabling closed-form reconstruction of 59-79% of samples across BERT, GPT-2, Qwen2, and Llama3.2 on multiple datasets.
Quantifying and Defending against the Privacy Risk in Logit-based Federated Learning
cs.CR 2026-06 unverdicted novelty 7.0

Logit-based federated learning leaks private model information to a semi-honest server via shared logits even with unrelated public data, enabling an adaptive stealing attack with theoretical bounds and a logit-pertur...
Fairness-Aware Federated Learning with Trajectory Shapley Value
cs.LG 2026-05 unverdicted novelty 7.0

FedTSV introduces Trajectory Shapley Value to dynamically weight client updates in federated learning based on their impact on the optimization trajectory for better fairness and stability.
Federated Martingale Posterior Samping
cs.LG 2026-05 unverdicted novelty 7.0

Federated martingale posterior sampling lets clients share data embeddings for central predictive Bayesian sampling, matching centralized performance and improving calibration on MNIST, CIFAR-10, and CIFAR-100.
When More Parameters Hurt: Foundation Model Priors Amplify Worst-Client Disparity Under Extreme Federated Heterogeneity
cs.LG 2026-05 unverdicted novelty 7.0

Foundation model priors amplify worst-client disparity under extreme federated heterogeneity, creating a fairness paradox where larger models perform worse for disadvantaged clients.
FedGUI: Benchmarking Federated GUI Agents across Heterogeneous Platforms, Devices, and Operating Systems
cs.MA 2026-04 unverdicted novelty 7.0

FedGUI is the first comprehensive benchmark for federated GUI agents that studies cross-platform, cross-device, cross-OS, and cross-source heterogeneity, with experiments showing performance gains from cross-platform ...
FedBCD:Communication-Efficient Accelerated Block Coordinate Gradient Descent for Federated Learning
cs.LG 2026-03 unverdicted novelty 7.0

FedBCGD reduces communication in federated learning by a factor of 1/N through block-wise parameter updates with accelerated convergence guarantees.
DP-FedAdamW: An Efficient Optimizer for Differentially Private Federated Large Models
cs.LG 2026-02 unverdicted novelty 7.0

DP-FedAdamW delivers an unbiased second-moment estimator for AdamW in DPFL, proving linear convergence acceleration without heterogeneity assumptions and outperforming SOTA by 5.83% on Tiny-ImageNet with Swin-Base at ε=1.
Random Walk Learning and the Pac-Man Attack
stat.ML 2025-08 unverdicted novelty 7.0

Introduces Pac-Man attack on random walks in distributed learning and Average Crossing duplication to ensure survival and convergence of SGD.
On the Surprising Effectiveness of a Single Global Merging in Decentralized Learning
cs.LG 2025-07 unverdicted novelty 7.0

A single global merge at the final step of decentralized SGD matches the convergence rate of parallel SGD while improving test accuracy under high data heterogeneity.
Class-Grouped Normalized Momentum and Faster Hyperparameter Exploration to Tackle Class Imbalance in Federated Learning
cs.LG 2026-07 unverdicted novelty 6.0

FedCGNM uses class-grouped normalized momentum to equalize gradients across imbalanced classes in FL with convergence analysis, plus FedHOO X-armed-bandit method for efficient resampling rate tuning.
Accurate and Resource-Efficient Federated Continual Learning
cs.LG 2026-06 unverdicted novelty 6.0

FedRAN achieves up to 4.8 pp higher accuracy in federated continual learning while using 30-122× less per-client communication by transmitting truncated-SVD summaries of random-feature Gram matrices and performing clo...
Demystifying the Optimal Fair Classifier in Multi-Class Classification
cs.LG 2026-05 unverdicted novelty 6.0

Derives tractable optimal fair multi-class classifier and supplies in-processing and post-processing algorithms that converge to the accuracy-fairness Pareto frontier.
S$^3$LDBO: A Snapshot Single-Loop Algorithm for Decentralized Bilevel Optimization
math.OC 2026-05 unverdicted novelty 6.0

S³LDBO is a snapshot single-loop algorithm for decentralized bilevel optimization that reduces computational cost via intermittent derivative skipping and provides ergodic and high-probability nonergodic iteration com...
FedSmoothLoRA: Toward Smoother and Faster Convergence in Federated Low-Rank Adaptation
cs.CV 2026-05 unverdicted novelty 6.0

FedSmoothLoRA improves federated LoRA fine-tuning by constructing local initializations from a round-matching matrix for cross-round continuity and a gradient-aligned matrix for client-specific guidance, yielding fast...
OmniISR: A Unified Framework for Centralized and Federated Learning via Intermediate Supervision and Regularization
cs.LG 2026-05 unverdicted novelty 6.0

OmniISR unifies centralized, federated, and hybrid learning by injecting mutual-information supervision and negative-entropy regularization at multiple hidden layers, with supporting convergence and drift bounds.
UB-SMoE: Universally Balanced Sparse Mixture-of-Experts for Resource-adaptive Federated Fine-tuning of Foundation Models
cs.LG 2026-05 unverdicted novelty 6.0

UB-SMoE balances expert utilization in heterogeneous federated SMoE fine-tuning via Dynamic Modulated Routing and Universal Pseudo-Gradient, delivering up to 45% compute reduction and 8.7x performance gains for low-re...
Rescaled Asynchronous SGD: Optimal Distributed Optimization under Data and System Heterogeneity
cs.LG 2026-05 unverdicted novelty 6.0

Rescaled ASGD recovers convergence to the true global objective by rescaling worker stepsizes proportional to computation times, matching the known time lower bound in the leading term under non-convex smoothness and ...
FedVSSAM: Mitigating Flatness Incompatibility in Sharpness-Aware Federated Learning
cs.LG 2026-05 unverdicted novelty 6.0

FedVSSAM mitigates flatness incompatibility in SAM-based federated learning by consistently using a variance-suppressed adjusted direction for local perturbation, descent, and global updates, with non-convex convergen...
PRISM: Exposing and Resolving Spurious Isolation in Federated Multimodal Continual Learning
cs.MM 2026-05 unverdicted novelty 6.0

PRISM maintains per-expert gradient subspace bases preserved under FedAvg to resolve spurious isolation in federated multimodal continual learning, outperforming 16 baselines with larger gains on longer task sequences.
Federated Cross-Modal Retrieval with Missing Modalities via Semantic Routing and Adapter Personalization
cs.CV 2026-04 unverdicted novelty 6.0

RCSR is a personalization-friendly federated framework that improves cross-modal retrieval accuracy and stability under missing modalities via semantic routing and adapters.
SecureGate: Learning When to Reveal PII Safely via Token-Gated Dual-Adapters for Federated LLMs
cs.CR 2026-02 unverdicted novelty 6.0

SecureGate reduces PII leakage up to 31.66X in federated LLM fine-tuning via token-gated dual LoRA adapters while preserving utility and achieving perfect routing reliability.
DeepFedNAS: Efficient Hardware-Aware Architecture Adaptation for Heterogeneous IoT Federations via Pareto-Guided Supernet Training
cs.LG 2026-01 unverdicted novelty 6.0

DeepFedNAS delivers up to 1.21% higher accuracy and 61x faster architecture search for federated learning on heterogeneous IoT by replacing random supernet sampling with Pareto-optimal elite architectures and using a ...
DFedReweighting: A Unified Framework for Objective-Oriented Reweighting in Decentralized Federated Learning
cs.LG 2025-12 unverdicted novelty 6.0

DFedReweighting is a unified reweighting method for decentralized federated learning that customizes aggregation via target metrics and strategies to improve fairness, Byzantine robustness, and other objectives while ...
Adaptive Federated Optimization
cs.LG 2020-02 unverdicted novelty 6.0

Proposes federated adaptive optimizers (FedAdagrad, FedAdam, FedYogi) with convergence analysis for non-convex objectives under data heterogeneity and reports empirical gains over FedAvg.
JiRAIYA: A Reputation-Based Hierarchical Federated Learning Framework on Web3
cs.DC 2026-06 unverdicted novelty 5.0

JiRAIYA introduces a hierarchical FL framework on Web3 using delegated managers, novelty detection, consensus, and reputation scores to improve transparency and attack resilience.
FIRMA: FIbonacci Ring Model Aggregation for Privacy-preserving Federated Learning
cs.LG 2026-05 unverdicted novelty 5.0

FIRMA introduces Fibonacci ring aggregation protocols for server-free federated learning that maintain private heads and achieve higher accuracy than FedAvg under label skew across multiple benchmarks and heterogeneit...
CRAFT: Conflict-Resolved Aggregation for Federated Training
cs.LG 2026-05 unverdicted novelty 5.0

CRAFT derives a closed-form solution for conflict-resolved aggregation in federated learning via geometric constraints and projection, with theoretical support for common descent and empirical gains on heterogeneous data.
FedSDR: Federated Self-Distillation with Rectification
cs.LG 2026-05 unverdicted novelty 5.0

FedSDR augments federated self-distillation with dual LoRA streams (local smoothing and global rectification) to produce globally aligned, factually faithful models under statistical heterogeneity.
Asynchronous Federated Unlearning with Invariance Calibration for Medical Imaging
cs.LG 2026-04 unverdicted novelty 5.0

AFU-IC decouples client unlearning from global federated training in medical imaging and adds server-side invariance calibration to prevent relearning of erased data.
PubSwap: Public-Data Off-Policy Coordination for Federated RLVR
cs.LG 2026-04 unverdicted novelty 5.0

PubSwap uses a small public dataset for selective off-policy response swapping in federated RLVR to improve coordination and performance over standard baselines on math and medical reasoning tasks.
REVERB-FL: Server-Side Adversarial and Reserve-Enhanced Federated Learning for Robust Audio Classification
eess.AS 2025-12 unverdicted novelty 5.0

REVERB-FL uses a server-side reserve set with retraining and adversarial training to reduce poisoning effects and speed convergence in federated audio classification under non-IID data.
FedQUIT: On-Device Federated Unlearning via a Quasi-Competent Virtual Teacher
cs.LG 2024-08 unverdicted novelty 5.0

FedQUIT performs on-device unlearning in federated learning by distilling from a virtual teacher that penalizes true-class confidence on forget data while preserving other output relationships, matching or exceeding p...
Understanding the Robustness of Distributed Self-Supervised Learning Frameworks Against Non-IID Data
cs.LG 2026-07 unverdicted novelty 4.0

Abstract-only report: theoretical comparison finds MIM more robust than CL to non-IID data in D-SSL and robustness scales with connectivity; MAR loss proposed as practical application.
Benchmarking Federated Learning and Knowledge Distillation for Point Cloud Classification
cs.GR 2026-06 unverdicted novelty 4.0

Benchmark of federated learning plus knowledge distillation for point cloud classification reveals that label-free distillation objectives are required for student accuracy to reflect federated teacher quality rather ...
Multi-Level Analyzation of Imbalance to Resolve Non-IID-Ness in Federated Learning
cs.LG 2026-06 unverdicted novelty 4.0

FedBB addresses inter-case, inter-class, and inter-client imbalances in federated learning via Positive Negative Balanced loss and Client Balanced Reweighting, outperforming baselines on X-ray and natural image datase...
GuidaPA: Privacy-Preserving Chatbot for Public Administration via Federated Learning
cs.AI 2026-05 unverdicted novelty 4.0

Federated QLoRA fine-tuning on distributed PA manuals from SIGESON and SIDFORS yields ROUGE-1/2/L of 61.10/55.77/59.44 and BLEU-4 of 45.02, close to centralized training.
Rethinking the Personalized Relaxed Initialization in the Federated Learning: Consistency and Generalization
cs.LG 2026-04 unverdicted novelty 4.0

FedInit uses reverse personalized initialization in FL to reduce client drift effects, showing via excess risk that inconsistency impacts generalization error more than optimization error.
FedNSAM:Consistency of Local and Global Flatness for Federated Learning
cs.LG 2026-02 unverdicted novelty 4.0

FedNSAM uses global Nesterov momentum to make local flatness consistent with global flatness in federated learning, yielding tighter convergence than FedSAM and better empirical performance.
Multi-Worker Selection based Distributed Swarm Learning for Edge IoT with Non-i.i.d. Data
cs.LG 2025-09 unverdicted novelty 4.0

Introduces M-DSL algorithm for distributed swarm learning that selects workers using a new non-i.i.d. degree metric to improve convergence and accuracy under data heterogeneity, with theoretical analysis and experimen...
DeTrigger: A Gradient-Centric Approach to Backdoor Attack Mitigation in Federated Learning
cs.LG 2024-11 unverdicted novelty 4.0

DeTrigger detects and mitigates backdoor attacks in federated learning via gradient analysis and temperature scaling, claiming up to 251x faster detection and 98.9% attack reduction on four datasets with minimal accur...
From Data Heterogeneity to Convergence: A Data-Centric Review of Federated Learning
cs.CR 2026-06 unverdicted novelty 3.0

A data-centric survey of federated learning that ranks non-IID data traits by influence on convergence, links splitting protocols to real phenomena, and examines data-related defenses under clean and adversarial conditions.
A Comparative Study of Federated Learning Aggregation Strategies under Homogeneous and Heterogeneous Data Distributions
cs.LG 2026-05 unverdicted novelty 2.0

Federated aggregation strategies show distinct performance trade-offs in accuracy, loss, and efficiency depending on whether client data distributions are homogeneous or heterogeneous.
Harmful Fine-tuning Attacks and Defenses for Large Language Models: A Survey
cs.CR 2024-09 unverdicted novelty 2.0

Survey of harmful fine-tuning attacks on LLMs, their variants, defense strategies, mechanical analysis, and evaluation methodologies.
