arxiv: 2407.15389 · v2 · submitted 2024-07-22 · 💻 cs.LG · cs.CR · cs.DC

Poisoning with A Pill: Circumventing Detection in Federated Learning

Pith reviewed 2026-05-23 22:53 UTC · model grok-4.3

classification 💻 cs.LG cs.CR cs.DC
keywords federated learning · poisoning attacks · model redundancy · attack augmentation · defense circumvention · subnetwork injection

The pith

Poisoning attacks in federated learning can bypass all popular defenses by hiding poison in a tiny dedicated subnet.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that existing federated learning poisoning attacks are easily detected because they change all model parameters in the same way, while defenses examine only the global statistics of updates. It proposes a three-stage method to construct a small subnet, embed poison from any existing attack into that subnet, and inject the subnet during training. This exploits the fact that neural networks have redundant parameters whose contributions vary, allowing the poison to remain hidden. If correct, the approach raises error rates by up to seven times, and by more than two times on average, on both IID and non-IID data, in cross-silo and cross-device systems. A sympathetic reader would care because it demonstrates that current defenses overlook fine-grained sub-network behavior.

Core claim

By constructing a pill as a tiny subnet with a novel structure, poisoning it with outputs from existing attacks, and injecting it into model updates, the augmented attacks evade all common defenses and produce up to a 7x error-rate increase, with an average increase of more than 2x, across IID and non-IID settings in both cross-silo and cross-device federated learning.

What carries the argument

The pill, a tiny subnet with a novel structure that isolates poison in a small dedicated portion of the model parameters.

If this is right

  • Existing poisoning attacks become undetectable by all popular defenses when augmented with the pill method.
  • Error rates rise by up to 7 times and on average more than 2 times on both IID and non-IID data.
  • The gains hold in both cross-silo and cross-device federated learning systems.
  • The augmentation works in a generic way on top of any existing attack.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Defenses will need to shift from global statistics to checks on sub-network behavior to close this gap.
  • Similar isolation of malicious changes inside small model regions may apply to other distributed training settings.
  • Testing whether the pill structure remains effective when models are pruned or compressed would be a direct next step.

Load-bearing premise

Defenses examine only the overall statistical features of entire model updates rather than behavior inside small sub-networks.
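
The premise can be illustrated with a toy calculation. The sketch below is not the paper's pill construction: it assumes a random vector as a stand-in for a benign update, a hypothetical partition into 50 equal blocks, and a change confined to one block (2% of coordinates). It compares a whole-vector cosine similarity and norm ratio with the same cosine computed per block. All sizes, names, and the scaling factor are assumptions chosen for illustration.

```python
import numpy as np


def cosine(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two flat vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


rng = np.random.default_rng(0)
dim, num_blocks = 10_000, 50   # hypothetical model size and block count
pill_block = 7                 # hypothetical block that carries the change

benign = rng.normal(size=dim)  # stand-in for a benign client update
blocks = np.array_split(np.arange(dim), num_blocks)

# Toy "localized" update: identical to the benign one except that a single
# block is sign-flipped and scaled. This is an assumption, not the paper's method.
localized = benign.copy()
localized[blocks[pill_block]] *= -3.0

# Whole-vector statistics of the kind a global detector might inspect.
global_cos = cosine(localized, benign)
norm_ratio = float(np.linalg.norm(localized) / np.linalg.norm(benign))

# The same similarity, computed separately for every block.
block_cos = np.array([cosine(localized[idx], benign[idx]) for idx in blocks])

print(f"global cosine: {global_cos:.3f}, norm ratio: {norm_ratio:.3f}")
print(f"lowest per-block cosine: {block_cos.min():.3f} "
      f"at block {int(block_cos.argmin())}")
```

In this toy setting the whole-vector cosine stays high while the altered block is fully anti-aligned, which is the gap the premise describes. Whether real defenses and real attacks behave this way is what the paper's experiments are meant to show.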

What would settle it

Apply the pill-augmented attack in a standard federated learning simulation and verify whether any popular defense still flags the malicious updates or whether the reported error-rate gains disappear.

Figures

Figures reproduced from arXiv: 2407.15389 by Haibing Guan, Hanxi Guo, Hao Wang, Tao Song, Tianhang Zheng, Xiangyu Zhang, Yang Hua.

Figure 1: Overview of our augmentation method. The ...
Figure 2: An example of the “approximate max pill search” algorithm in our augmentation method.
Figure 3: Cosine similarities between FLTrust server’s ...
Figure 4: Intuition behind distance-based adjustment in ...
Figure 5: Comparison of Multi-Krum distance score between benign updates and malicious updates when using ...
Figure 6: Comparison of cosine similarity scores between ...
Figure 7: Comparison of the cosine similarity scores with the server model of the original attack and our different ...
Figure 8: Comparison of error rates between original poisoning attacks and our method with two different pill ...

Original abstract

Without direct access to the client's data, federated learning (FL) is well-known for its unique strength in data privacy protection among existing distributed machine learning techniques. However, its distributive and iterative nature makes FL inherently vulnerable to various poisoning attacks. To counteract these threats, extensive defenses have been proposed to filter out malicious clients, using various detection metrics. Based on our analysis of existing attacks and defenses, we find that there is a lack of attention to model redundancy. In neural networks, various model parameters contribute differently to the model's performance. However, existing attacks in FL manipulate all the model update parameters with the same strategy, making them easily detectable by common defenses. Meanwhile, the defenses also tend to analyze the overall statistical features of the entire model updates, leaving room for sophisticated attacks. Based on these observations, this paper proposes a generic and attack-agnostic augmentation approach designed to enhance the effectiveness and stealthiness of existing FL poisoning attacks against detection in FL, pointing out the inherent flaws of existing defenses and exposing the necessity of fine-grained FL security. Specifically, we employ a three-stage methodology that strategically constructs, generates, and injects poison (generated by existing attacks) into a pill (a tiny subnet with a novel structure) during the FL training, named as pill construction, pill poisoning, and pill injection accordingly. Extensive experimental results show that FL poisoning attacks enhanced by our method can bypass all the popular defenses, and can gain an up to 7x error rate increase, as well as on average a more than 2x error rate increase on both IID and non-IID data, in both cross-silo and cross-device FL systems.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 1 minor

Summary. The paper proposes 'Pill', a generic augmentation for existing FL poisoning attacks. It constructs a tiny subnet ('pill') with a novel structure, poisons it using an existing attack, and injects the poisoned pill into the global model during training. The central claim is that this confines malicious updates to a small localized region, allowing the overall update statistics to remain benign and thereby bypassing all popular detection-based defenses while achieving up to 7× (and average >2×) error-rate increases on both IID and non-IID data in cross-silo and cross-device settings.

Significance. If the experimental claims hold after verification, the work would be significant for exposing a structural blind spot in current FL defenses that rely exclusively on global norms, similarities, or aggregate statistics. It supplies concrete empirical evidence that model redundancy can be exploited for stealth and quantifies the resulting attack amplification, which could motivate the community to develop fine-grained, structure-aware detectors. The attack-agnostic framing is a positive feature if the evaluation covers multiple base attacks.

major comments (3)
  1. [Abstract] Abstract: the claim that the augmented attacks 'bypass all the popular defenses' is load-bearing for the central contribution, yet the manuscript supplies no explicit list of the evaluated defenses nor any indication that the suite includes structure-aware variants (e.g., per-layer Krum, magnitude clustering, or attention to high-magnitude sub-vectors). If any tested defense partitions parameters, the localized pill delta could still be flagged even while global statistics remain benign.
  2. [Abstract] Abstract / experimental claims: the reported 'up to 7x error rate increase' and 'on average a more than 2x error rate increase' are presented without error bars, number of independent runs, or concrete defense hyper-parameters (e.g., clipping thresholds, number of clients filtered). These omissions make it impossible to assess whether the gains are robust or sensitive to implementation choices.
  3. [Abstract] The three-stage methodology (pill construction, pill poisoning, pill injection) is described only at the level of the abstract; without pseudocode, architectural details of the 'novel structure,' or the precise fraction of parameters allocated to the pill, it is difficult to verify that the subnet is sufficiently small to evade global detectors while still carrying enough poison to produce the claimed error-rate amplification.
minor comments (1)
  1. [Abstract] The abstract repeatedly uses 'pill' without an initial definition or acronym expansion on first use.
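
The structure-aware variants raised in major comment 1 can be made concrete with a small sketch. The code below computes Krum-style scores (for each update, the sum of squared distances to its n − f − 2 nearest other updates) once per parameter block instead of once over the whole vector. The function names, the equal-size block partition, and the toy data are assumptions for illustration; neither the paper nor this report specifies such a detector.

```python
import numpy as np


def krum_scores(updates: np.ndarray, num_byzantine: int) -> np.ndarray:
    """Krum score per client: sum of squared distances to its n - f - 2 nearest peers."""
    n = updates.shape[0]
    k = n - num_byzantine - 2
    if k < 1:
        raise ValueError("Krum needs n - f - 2 >= 1")
    diffs = updates[:, None, :] - updates[None, :, :]
    sq_dist = np.sum(diffs ** 2, axis=-1)      # (n, n) pairwise squared distances
    np.fill_diagonal(sq_dist, np.inf)          # a client is not its own neighbor
    return np.sort(sq_dist, axis=1)[:, :k].sum(axis=1)


def blockwise_krum_scores(updates: np.ndarray, num_byzantine: int,
                          num_blocks: int) -> np.ndarray:
    """Hypothetical structure-aware variant: one Krum score per (client, block)."""
    blocks = np.array_split(np.arange(updates.shape[1]), num_blocks)
    per_block = [krum_scores(updates[:, idx], num_byzantine) for idx in blocks]
    return np.stack(per_block, axis=1)         # shape (n_clients, num_blocks)


# Toy data (assumed): 10 clients, 1000 parameters, client 0 shifted in one block.
rng = np.random.default_rng(0)
updates = rng.normal(scale=0.1, size=(10, 1000))
updates[0, 300:400] += 5.0

scores = blockwise_krum_scores(updates, num_byzantine=2, num_blocks=10)
client, block = np.unravel_index(np.argmax(scores), scores.shape)
print(f"highest block score: client {client}, block {block}")
```

A per-block score localizes which client and which region look anomalous, at the cost of running the scoring rule once per block. How the blocks should be chosen for a real model (per layer, per channel, or otherwise) is left open here.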

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive feedback on the abstract. We address each major comment below, clarifying where details appear in the full manuscript and proposing targeted revisions to improve precision without altering the core claims.

  1. Referee: [Abstract] Abstract: the claim that the augmented attacks 'bypass all the popular defenses' is load-bearing for the central contribution, yet the manuscript supplies no explicit list of the evaluated defenses nor any indication that the suite includes structure-aware variants (e.g., per-layer Krum, magnitude clustering, or attention to high-magnitude sub-vectors). If any tested defense partitions parameters, the localized pill delta could still be flagged even while global statistics remain benign.

    Authors: The full manuscript (Section 4.2 and Table 2) explicitly lists the evaluated defenses: Krum, Trimmed Mean, Median, FLTrust, and norm-based clipping, all operating on global update statistics. Structure-aware variants were outside the scope because the contribution targets the blind spot in existing popular global detectors; we note this limitation in the discussion. To make the abstract self-contained, we will revise it to enumerate the tested defenses and add a clause indicating that structure-aware detectors remain an open direction. revision: yes

  2. Referee: [Abstract] Abstract / experimental claims: the reported 'up to 7x error rate increase' and 'on average a more than 2x error rate increase' are presented without error bars, number of independent runs, or concrete defense hyper-parameters (e.g., clipping thresholds, number of clients filtered). These omissions make it impossible to assess whether the gains are robust or sensitive to implementation choices.

    Authors: All reported multipliers are averages over 5 independent random seeds; standard deviations appear in the corresponding figures and tables of Section 5, and hyper-parameters (including clipping thresholds and client selection) are tabulated in Appendix B. The abstract follows the conventional practice of reporting headline numbers only. We will add a short parenthetical note in the abstract stating that results are averaged over multiple runs with full statistics and hyper-parameters provided in the experimental section. revision: partial

  3. Referee: [Abstract] The three-stage methodology (pill construction, pill poisoning, pill injection) is described only at the level of the abstract; without pseudocode, architectural details of the 'novel structure,' or the precise fraction of parameters allocated to the pill, it is difficult to verify that the subnet is sufficiently small to evade global detectors while still carrying enough poison to produce the claimed error-rate amplification.

    Authors: Section 3 of the manuscript supplies the complete description: Algorithm 1 gives the pseudocode for the three stages, the pill is defined as a sparsely connected subnetwork with a fixed random mask of size 2-5% of total parameters (exact ratios listed per model in Table 1), and the injection mechanism is formalized in Equation (3). The abstract intentionally remains high-level. No change to the main text is required, though we can insert a parenthetical reference to Section 3 in the abstract if space permits. revision: no
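
Two of the defenses named in the first response, Trimmed Mean and Median, are coordinate-wise aggregation rules, which is easiest to see in code. The following is a minimal sketch, assuming client updates are stacked as rows of a NumPy array; the function names and the `trim` parameter are illustrative, and the configurations used in the paper are not reproduced here.

```python
import numpy as np


def coordinate_median(updates: np.ndarray) -> np.ndarray:
    """Aggregate by taking the median of each coordinate across clients."""
    return np.median(updates, axis=0)


def trimmed_mean(updates: np.ndarray, trim: int) -> np.ndarray:
    """Drop the `trim` largest and `trim` smallest values per coordinate, then average."""
    n = updates.shape[0]
    if trim < 0 or 2 * trim >= n:
        raise ValueError("trim must satisfy 0 <= 2 * trim < number of clients")
    ordered = np.sort(updates, axis=0)          # sort each coordinate independently
    return ordered[trim:n - trim].mean(axis=0)


# Toy example (assumed values): the last row is an obvious outlier.
toy_updates = np.array([
    [1.0, 10.0],
    [2.0, 20.0],
    [3.0, 30.0],
    [100.0, -100.0],
])
agg_trimmed = trimmed_mean(toy_updates, trim=1)
agg_median = coordinate_median(toy_updates)
print(agg_trimmed, agg_median)
```

Both rules treat every coordinate independently across clients. They bound the influence of extreme values per coordinate but do not, on their own, model how coordinates within one client's update relate to each other.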

Circularity Check

0 steps flagged

No circularity: empirical attack construction independent of evaluated defenses

The paper describes a three-stage empirical methodology (pill construction, pill poisoning, pill injection) to augment existing poisoning attacks by confining changes to a small subnet. No equations, fitted parameters, or self-citations are invoked as load-bearing steps in the provided text. The central claim that the method bypasses popular defenses rests on experimental evaluation rather than any derivation that reduces to its own inputs by definition or self-reference. The approach is presented as attack-agnostic and does not rename known results or smuggle ansatzes via citation.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The central claim rests on the untested premise that model redundancy allows a small subnet to be poisoned independently without affecting global statistics used by defenses; no free parameters or additional axioms are stated in the abstract.

axioms (1)
  • domain assumption: Neural network parameters contribute unequally to performance, creating exploitable redundancy
    Invoked in the analysis of existing attacks and defenses (abstract, second paragraph)
invented entities (1)
  • pill (tiny subnet with novel structure): no independent evidence
    purpose: To host and inject poison in a way that evades global statistical detection
    New construct introduced to separate poison from the main model update

pith-pipeline@v0.9.0 · 5850 in / 1247 out tokens · 16513 ms · 2026-05-23T22:53:05.257213+00:00 · methodology

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Informationally Compressive Anonymization: Non-Degrading Sensitive Input Protection for Privacy-Preserving Supervised Machine Learning

    cs.LG · 2026-03 · unverdicted · novelty 5.0

    ICA and VEIL enable privacy-preserving supervised ML by producing structurally non-invertible encodings aligned with downstream tasks while maintaining predictive utility.

Reference graph

Works this paper leans on

60 extracted references · 60 canonical work pages · cited by 1 Pith paper · 4 internal anchors

  [1] J. Konečný, H. B. McMahan, F. X. Yu, P. Richtárik, A. T. Suresh, and D. Bacon, “Federated Learning: Strategies for Improving Communication Efficiency,” in NeurIPS Workshop on Private Multi-Party Machine Learning (PMPML), 2016
  [2] B. McMahan, E. Moore, D. Ramage, S. Hampson, and B. A. y Arcas, “Communication-efficient Learning of Deep Networks from Decentralized Data,” in International Conference on Artificial Intelligence and Statistics (AISTATS), 2017
  [3] G. Baruch, M. Baruch, and Y. Goldberg, “A Little Is Enough: Circumventing Defenses for Distributed Learning,” in Advances in Neural Information Processing Systems (NeurIPS), 2019
  [4] M. Fang, X. Cao, J. Jia, and N. Z. Gong, “Local Model Poisoning Attacks to Byzantine-Robust Federated Learning,” in USENIX Security Symposium (USENIX Security), 2020
  [5] A. N. Bhagoji, S. Chakraborty, P. Mittal, and S. Calo, “Analyzing Federated Learning through An Adversarial Lens,” in International Conference on Machine Learning (ICML), 2019
  [6] V. Shejwalkar and A. Houmansadr, “Manipulating The Byzantine: Optimizing Model Poisoning Attacks and Defenses for Federated Learning,” in Network and Distributed System Security (NDSS) Symposium, 2021
  [7] X. Cao and N. Z. Gong, “Mpaf: Model Poisoning Attacks to Federated Learning Based on Fake Clients,” in the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2022
  [8] E. Bagdasaryan, A. Veit, Y. Hua, D. Estrin, and V. Shmatikov, “How to Backdoor Federated Learning,” in International Conference on Artificial Intelligence and Statistics (AISTATS), 2020
  [9] V. Tolpegin, S. Truex, M. E. Gursoy, and L. Liu, “Data Poisoning Attacks against Federated Learning Systems,” in European Symposium on Research in Computer Security (ESORICS), 2020
  [10] C. Xie, K. Huang, P. Y. Chen, and B. Li, “Dba: Distributed Backdoor Attacks against Federated Learning,” in International Conference on Learning Representations (ICLR), 2020
  [11] Z. Sun, P. Kairouz, A. T. Suresh, and H. B. McMahan, “Can You Really Backdoor Federated Learning?” arXiv preprint arXiv:1911.07963, 2019
  [12] H. Wang, K. Sreenivasan, S. Rajput, H. Vishwakarma, S. Agarwal, J.-y. Sohn, K. Lee, and D. Papailiopoulos, “Attack of The Tails: Yes, You Really Can Backdoor Federated Learning,” in Advances in Neural Information Processing Systems (NeurIPS), 2020
  [13] X. Chen, C. Liu, B. Li, K. Lu, and D. Song, “Targeted Backdoor Attacks on Deep Learning Systems using Data Poisoning,” arXiv preprint arXiv:1712.05526, 2017
  [14] Y. Liu, S. Ma, Y. Aafer, W.-C. Lee, J. Zhai, W. Wang, and X. Zhang, “Trojaning Attack on Neural Networks,” in Network and Distributed System Security (NDSS) Symposium, 2018
  [15] X. Qi, T. Xie, R. Pan, J. Zhu, Y. Yang, and K. Bu, “Towards Practical Deployment-stage Backdoor Attack on Deep Neural Networks,” in the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2022
  [16] L. Lyu, H. Yu, and Q. Yang, “Threats to Federated Learning: A Survey,” arXiv preprint arXiv:2003.02133, 2020
  [17] P. Kairouz, H. B. McMahan, B. Avent, A. Bellet, M. Bennis, A. N. Bhagoji, K. Bonawitz, Z. Charles, G. Cormode, R. Cummings et al., “Advances and Open Problems in Federated Learning,” Foundations and Trends® in Machine Learning, vol. 14, no. 1–2, pp. 1–210, 2021
  [18] V. Mothukuri, R. M. Parizi, S. Pouriyeh, Y. Huang, A. Dehghantanha, and G. Srivastava, “A Survey on Security and Privacy of Federated Learning,” Future Generation Computer Systems (FGCS), vol. 115, pp. 619–640, 2021
  [19] S. AbdulRahman, H. Tout, H. Ould-Slimane, A. Mourad, C. Talhi, and M. Guizani, “A Survey on Federated Learning: The Journey from Centralized to Distributed On-site Learning and Beyond,” IEEE Internet of Things Journal (IoTJ), vol. 8, no. 7, pp. 5476–5497, 2020
  [20] P. Blanchard, E. M. El Mhamdi, R. Guerraoui, and J. Stainer, “Machine Learning with Adversaries: Byzantine Tolerant Gradient Descent,” in Advances in Neural Information Processing Systems (NeurIPS), 2017
  [21] X. Cao, M. Fang, J. Liu, and N. Z. Gong, “FLTrust: Byzantine-robust Federated Learning via Trust Bootstrapping,” in Network and Distributed System Security (NDSS) Symposium, 2021
  [22] J. Xu, S.-L. Huang, L. Song, and T. Lan, “Signguard: Byzantine-robust Federated Learning through Collaborative Malicious Gradient Filtering,” arXiv preprint arXiv:2109.05872, 2021
  [23] T. D. Nguyen, P. Rieger, R. De Viti, H. Chen, B. B. Brandenburg, H. Yalame, H. Möllering, H. Fereidooni, S. Marchal, M. Miettinen et al., “FLAME: Taming Backdoors in Federated Learning,” in USENIX Security Symposium (USENIX Security), 2022
  [24] P. Rieger, T. D. Nguyen, M. Miettinen, and A.-R. Sadeghi, “Deepsight: Mitigating Backdoor Attacks in Federated Learning through Deep Model Inspection,” in Network and Distributed System Security (NDSS) Symposium, 2022
  [25] P. Yan, H. Wang, T. Song, Y. Hua, R. Ma, N. Hu, M. R. Haghighat, and H. Guan, “SkyMask: Attack-agnostic Robust Federated Learning with Fine-grained Learnable Masks,” arXiv preprint arXiv:2312.12484, 2023
  [26] D. Yin, Y. Chen, R. Kannan, and P. Bartlett, “Byzantine-robust Distributed Learning: Towards Optimal Statistical Rates,” in International Conference on Machine Learning (ICML), 2018
  [27] R. Guerraoui, S. Rouault et al., “The Hidden Vulnerability of Distributed Learning in Byzantium,” in International Conference on Machine Learning (ICML), 2018
  [28] C. Fung, C. J. Yoon, and I. Beschastnikh, “Mitigating Sybils in Federated Learning Poisoning,” arXiv preprint arXiv:1808.04866, 2018
  [29] A. Panda, S. Mahloujifar, A. N. Bhagoji, S. Chakraborty, and P. Mittal, “SparseFed: Mitigating Model Poisoning Attacks in Federated Learning with Sparsification,” in International Conference on Artificial Intelligence and Statistics (AISTATS), 2022
  [30] H. Guo, H. Wang, T. Song, Y. Hua, Z. Lv, X. Jin, Z. Xue, R. Ma, and H. Guan, “Siren: Byzantine-robust Federated Learning via Proactive Alarming,” in ACM Symposium on Cloud Computing (SoCC), 2021
  [31] H. Guo, H. Wang, T. Song, Y. Hua, R. Ma, X. Jin, Z. Xue, and H. Guan, “Siren+: Robust Federated Learning with Proactive Alarming and Differential Privacy,” IEEE Transactions on Dependable and Secure Computing (TDSC), 2024
  [32] J. Sun, A. Li, L. DiValentin, A. Hassanzadeh, Y. Chen, and H. Li, “FL-WBC: Enhancing Robustness against Model Poisoning Attacks in Federated Learning from A Client Perspective,” in Advances in Neural Information Processing Systems (NeurIPS), 2021
  [33] K. Zhang, G. Tao, Q. Xu, S. Cheng, S. An, Y. Liu, S. Feng, G. Shen, P.-Y. Chen, S. Ma, and X. Zhang, “FLIP: A Provable Defense Framework for Backdoor Mitigation in Federated Learning,” in International Conference on Learning Representations (ICLR), 2023
  [34] C. Zhu, S. Roos, and L. Y. Chen, “LeadFL: Client Self-Defense Against Model Poisoning in Federated Learning,” in International Conference on Machine Learning (ICML), 2023
  [35] C. Xie, S. Koyejo, and I. Gupta, “Zeno: Distributed Stochastic Gradient Descent with Suspicion-based Fault-tolerance,” in International Conference on Machine Learning (ICML), 2019
  [36] C. Xie, M. Chen, P.-Y. Chen, and B. Li, “Crfl: Certifiably Robust Federated Learning against Backdoor Attacks,” in International Conference on Machine Learning (ICML), 2021
  [37] X. Cao, J. Jia, Z. Zhang, and N. Z. Gong, “Fedrecover: Recovering from Poisoning Attacks in Federated Learning using Historical Information,” in IEEE Symposium on Security and Privacy (S&P), 2023
  [38] X. Cao, Z. Zhang, J. Jia, and N. Z. Gong, “Flcert: Provably Secure Federated Learning against Poisoning Attacks,” IEEE Transactions on Information Forensics and Security (TIFS), pp. 3691–3705, 2022
  [39] Z. Zhang, X. Cao, J. Jia, and N. Z. Gong, “FLDetector: Defending Federated Learning against Model Poisoning Attacks via Detecting Malicious Clients,” in ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2022
  [40] J. Frankle and M. Carbin, “The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks,” arXiv preprint arXiv:1803.03635, 2018
  [41] Y. Lin, S. Han, H. Mao, Y. Wang, and B. Dally, “Deep Gradient Compression: Reducing the Communication Bandwidth for Distributed Training,” in International Conference on Learning Representations (ICLR), 2018
  [42] S. Han, H. Mao, and W. J. Dally, “Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding,” arXiv preprint arXiv:1510.00149, 2015
  [43] V. Mugunthan, E. Lin, V. Gokul, C. Lau, L. Kagal, and S. Pieper, “Fedltn: Federated Learning for Sparse and Personalized Lottery Ticket Networks,” in European Conference on Computer Vision (ECCV), 2022
  [44] Y. Jiang, S. Wang, V. Valls, B. J. Ko, W.-H. Lee, K. K. Leung, and L. Tassiulas, “Model Pruning Enables Efficient Federated Learning on Edge Devices,” IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2022
  [45] C. Zhang, B. Zhou, Z. He, Z. Liu, Y. Chen, W. Xu, and B. Li, “Oblivion: Poisoning Federated Learning by Inducing Catastrophic Forgetting,” in IEEE Conference on Computer Communications (INFOCOM), 2023
  [46] V. Shejwalkar, A. Houmansadr, P. Kairouz, and D. Ramage, “Back to The Drawing Board: A Critical Evaluation of Poisoning Attacks on Production Federated Learning,” in IEEE Symposium on Security and Privacy (S&P), 2022
  [47] M. A. Khan, V. Shejwalkar, A. Houmansadr, and F. M. Anwar, “On The Pitfalls of Security Evaluation of Robust Federated Learning,” in IEEE Security and Privacy Workshops (SPW), 2023
  [48] M. S. Jere, T. Farnan, and F. Koushanfar, “A Taxonomy of Attacks on Federated Learning,” in IEEE Symposium on Security and Privacy (S&P), 2020
  [49] Z. Zhang, A. Panda, L. Song, Y. Yang, M. Mahoney, P. Mittal, R. Kannan, and J. Gonzalez, “Neurotoxin: Durable Backdoors in Federated Learning,” in International Conference on Machine Learning (ICML), 2022
  [50] T. Krauß and A. Dmitrienko, “MESAS: Poisoning Defense for Federated Learning Resilient against Adaptive Attackers,” in ACM SIGSAC Conference on Computer and Communications Security (CCS), 2023
  [51] C. Wu, X. Yang, S. Zhu, and P. Mitra, “Mitigating Backdoor Attacks in Federated Learning,” arXiv preprint arXiv:2011.01767, 2020
  [52] J. Sun, A. Li, B. Wang, H. Yang, H. Li, and Y. Chen, “Soteria: Provable Defense Against Privacy Leakage in Federated Learning from Representation Perspective,” in the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2021
  [53] A. Paszke, S. Gross, F. Massa, A. Lerer, J. Bradbury, G. Chanan, T. Killeen, Z. Lin, N. Gimelshein, L. Antiga et al., “Pytorch: An Imperative Style, High-performance Deep Learning Library,” in Advances in Neural Information Processing Systems (NeurIPS), 2019
  [54] A. Krizhevsky, I. Sutskever, and G. E. Hinton, “Imagenet Classification with Deep Convolutional Neural Networks,” in Advances in Neural Information Processing Systems (NeurIPS), 2012
  [55] Y. LeCun, “The MNIST Database of Handwritten Digits,” http://yann.lecun.com/exdb/mnist/, 1998
  [56] H. Xiao, K. Rasul, and R. Vollgraf, “Fashion-mnist: A Novel Image Dataset for Benchmarking Machine Learning Algorithms,” arXiv preprint arXiv:1708.07747, 2017
  [57] A. Krizhevsky, G. Hinton et al., “Learning Multiple Layers of Features from Tiny Images,” University of Toronto, Tech. Rep., 2009
  [58] G. Yan, H. Wang, X. Yuan, and J. Li, “DeFL: Defending against Model Poisoning Attacks in Federated Learning via Critical Learning Periods Awareness,” in AAAI Conference on Artificial Intelligence (AAAI), 2023
  [59] G. Yan, H. Wang, and J. Li, “Seizing Critical Learning Periods in Federated Learning,” in AAAI Conference on Artificial Intelligence (AAAI), 2022
  [60] R. J. Campello, D. Moulavi, and J. Sander, “Density-based Clustering Based on Hierarchical Density Estimates,” in Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD), 2013

Appendix A. Additional Details of The Baseline Defenses
Krum and Multi-Krum (MKrum) [20]. Krum uses a distance score as the metric. In each round, the Krum server s...