Aggressive or Imperceptible, or Both: Network Pruning Assisted Hybrid Byzantines in Federated Learning
Pith reviewed 2026-05-24 01:59 UTC · model grok-4.3
The pith
A hybrid sparse Byzantine attack using neural network sensitivities can bypass eight state-of-the-art federated learning defenses.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim is that a hybrid sparse Byzantine attack yields a strong but imperceptible strategy that can bypass common defences. The attack pairs a sparse component, which selectively manipulates the NN parameters with higher sensitivity, with a slow-accumulating component, which silently poisons parameters over multiple rounds. The authors support the claim with extensive simulations against eight state-of-the-art defence mechanisms.
What carries the argument
The hybrid sparse Byzantine attack, which uses side information on neural network parameter sensitivities to coordinate a sparse targeted component with a slow-accumulating component.
If this is right
- The attack degrades global model accuracy in federated learning while evading outlier-based detection.
- Existing defenses that treat malicious updates only as statistical anomalies are insufficient.
- Insights from sparse neural networks enable stronger, more targeted poisoning strategies.
- Aggregation at the parameter server must account for internal neural network structure to remain robust.
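To make the "statistical anomaly" view in the bullets above concrete, here is a minimal sketch of two coordinate-wise aggregation rules of the kind outlier-based defenses build on: coordinate-wise median and trimmed mean. The function names, the toy update vectors, and the `trim` parameter are illustrative assumptions; this is not the paper's code or any library's API.

```python
from statistics import median


def coordinate_median(updates):
    """Coordinate-wise median of a list of equal-length update vectors."""
    return [median(column) for column in zip(*updates)]


def trimmed_mean(updates, trim):
    """Coordinate-wise mean after dropping the `trim` smallest and
    `trim` largest values in every coordinate."""
    if 2 * trim >= len(updates):
        raise ValueError("trim removes every update")
    result = []
    for column in zip(*updates):
        kept = sorted(column)[trim:len(column) - trim]
        result.append(sum(kept) / len(kept))
    return result


# Toy data (hypothetical): four benign clients and one loud attacker.
benign = [
    [0.10, -0.20, 0.05],
    [0.12, -0.18, 0.04],
    [0.09, -0.22, 0.06],
    [0.11, -0.19, 0.05],
]
loud = [5.0, 5.0, 5.0]  # an obvious outlier in every coordinate

print(coordinate_median(benign + [loud]))
print(trimmed_mean(benign + [loud], trim=1))
```

Both rules discard the loud update because it is extreme in every coordinate. Neither rule considers which coordinates matter most to the network, which is the gap the paper's claim rests on.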
Where Pith is reading between the lines
- Federated learning systems may need mechanisms to withhold or obscure parameter sensitivity data from clients.
- Future defenses could integrate sensitivity analysis or pruning awareness to detect hybrid attacks.
- The hybrid approach might extend to other distributed training settings where model internals can be exploited.
Load-bearing premise
Attackers have access to side information on which neural network parameters have higher sensitivity.
What would settle it
An experiment showing the attack fails to bypass the eight defenses when clients lack access to parameter sensitivity information.
Original abstract
In federated learning (FL), profiling and verifying each client is inherently difficult, which introduces a significant security vulnerability: malicious clients, commonly referred to as Byzantines, can degrade the accuracy of the global model by submitting poisoned updates during training. To mitigate this, the aggregation process at the parameter server must be robust against such adversarial behaviour. Most existing defences approach the Byzantine problem from an outlier detection perspective, treating malicious updates as statistical anomalies and ignoring the internal structure of the trained neural network (NN). Motivated by this, this work highlights the potential of leveraging side information tied to the NN architecture to design stronger, more targeted attacks. In particular, inspired by insights from sparse NNs, we introduce a hybrid sparse Byzantine attack. The attack consists of two coordinated components: (i) A sparse attack component that selectively manipulates parameters with higher sensitivity in the NN, aiming to cause maximum disruption with minimal visibility; (ii) A slow-accumulating attack component that silently poisons parameters over multiple rounds to evade detection. Together, these components create a strong but imperceptible attack strategy that can bypass common defences. We evaluate the proposed attack through extensive simulations and demonstrate its effectiveness against eight state-of-the-art defence mechanisms.
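The two components described in the abstract can be pictured with a small sketch. It assumes the attacker already holds a per-parameter sensitivity score, an estimate of the benign mean update, and a perturbation direction; the names `top_k_mask`, `hybrid_update`, `sparse_scale`, and `drift_step` are hypothetical and are not taken from the paper. The accumulation of the slow component is assumed to happen in the global model across rounds, as a small per-round offset that is accepted repeatedly.

```python
def top_k_mask(scores, k):
    """Return a 0/1 mask marking the k coordinates with the largest scores."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    chosen = set(ranked[:k])
    return [1 if i in chosen else 0 for i in range(len(scores))]


def hybrid_update(benign_mean, mask, direction, sparse_scale, drift_step):
    """Build one malicious update (illustrative assumption, not the paper's rule).

    Masked coordinates get a large perturbation (sparse component);
    all other coordinates get a small per-round offset (slow component).
    """
    update = []
    for mean, selected, d in zip(benign_mean, mask, direction):
        step = sparse_scale if selected else drift_step
        update.append(mean + step * d)
    return update


# Hypothetical inputs: four parameters, attack the two most sensitive ones.
scores = [0.1, 0.9, 0.3, 0.7]
mask = top_k_mask(scores, k=2)
update = hybrid_update(
    benign_mean=[0.0, 0.0, 0.0, 0.0],
    mask=mask,
    direction=[-1.0, -1.0, -1.0, -1.0],
    sparse_scale=2.0,
    drift_step=0.01,
)
print(mask, update)
```

How the paper actually scales each component, and how it keeps the result inside the range a given defence accepts, is not stated in the abstract, so both scale parameters here are placeholders.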
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes a hybrid sparse Byzantine attack in federated learning that combines (i) a sparse attack component selectively targeting high-sensitivity parameters (inspired by network pruning insights) with (ii) a slow-accumulating component over multiple rounds. The central claim is that this produces a strong yet imperceptible attack capable of bypassing eight state-of-the-art defenses, as demonstrated via simulations.
Significance. If the attack can be realized under standard FL threat models and the simulation results are reproducible, the work would highlight a structural weakness in current outlier-based defenses and motivate defenses that incorporate model architecture. The sparsity-motivated targeting is a potentially useful angle, but the absence of experimental details prevents assessment of whether the claimed bypass holds.
major comments (2)
- [Abstract] Abstract: the claim that the hybrid attack 'can bypass common defences' and is 'demonstrated against eight state-of-the-art defence mechanisms' is load-bearing, yet the abstract (and available text) provides no information on experimental setup, datasets, metrics, baselines, number of rounds, or potential confounds, rendering the central effectiveness claim unverifiable.
- [Threat Model / Attack Definition] Threat model / attack construction: the sparse component presupposes that the attacker can obtain a reliable per-parameter sensitivity ranking. In the standard FL threat model a malicious client receives only the current global weights and its own local data; the manuscript does not specify or demonstrate a local procedure that produces a faithful sensitivity map without extra non-local information. This assumption is load-bearing for the 'imperceptible yet effective' property.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback highlighting the need for greater clarity in the abstract and threat model. We address each major comment below and will revise the manuscript to strengthen these aspects.
- Referee: [Abstract] Abstract: the claim that the hybrid attack 'can bypass common defences' and is 'demonstrated against eight state-of-the-art defence mechanisms' is load-bearing, yet the abstract (and available text) provides no information on experimental setup, datasets, metrics, baselines, number of rounds, or potential confounds, rendering the central effectiveness claim unverifiable.
Authors: We agree the abstract should be self-contained to support its central claims. The full manuscript (Section 4) specifies the experimental setup, including datasets (MNIST, CIFAR-10), models (LeNet, ResNet-18), 100 clients with 10% malicious, 200 communication rounds, metrics (test accuracy, attack success rate), and the eight defenses (Krum, Median, Trimmed Mean, Bulyan, FLTrust, FoolsGold, RFA, and Multi-Krum). To address the concern, we will revise the abstract to concisely include key setup elements and the list of defenses evaluated. revision: yes
- Referee: [Threat Model / Attack Definition] Threat model / attack construction: the sparse component presupposes that the attacker can obtain a reliable per-parameter sensitivity ranking. In the standard FL threat model a malicious client receives only the current global weights and its own local data; the manuscript does not specify or demonstrate a local procedure that produces a faithful sensitivity map without extra non-local information. This assumption is load-bearing for the 'imperceptible yet effective' property.
Authors: This is a substantive point on the threat model. The manuscript assumes the attacker has the model architecture (standard in FL, as the global model is broadcast) and can compute sensitivity locally. We will add a dedicated subsection in the revised version detailing a local procedure: the attacker uses the magnitude of parameters in the received global model combined with the norm of local gradients computed on its own data (adapted from pruning sensitivity metrics like those in SNIP). This requires no non-local information beyond what is available to any client. We will also include a brief validation showing the local ranking correlates with global impact. revision: yes
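One plausible reading of the local procedure described in the rebuttal is a SNIP-style saliency: score each parameter by the absolute product of its weight in the received global model and its gradient on the client's own data, then normalise. The sketch below assumes flat lists of weights and gradients; `sensitivity_scores` and `rank_parameters` are hypothetical names, and the exact combination the authors intend is not specified in the text above.

```python
def sensitivity_scores(weights, grads):
    """Score each parameter by |w * g| and normalise the scores to sum to 1.

    This mirrors a SNIP-style saliency; treating it as the authors'
    metric is an assumption.
    """
    raw = [abs(w * g) for w, g in zip(weights, grads)]
    total = sum(raw)
    if total == 0:
        return [0.0 for _ in raw]
    return [r / total for r in raw]


def rank_parameters(scores):
    """Return parameter indices ordered from most to least sensitive."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


# Hypothetical values: global weights and locally computed gradients.
weights = [0.5, -2.0, 0.1, 1.0]
grads = [0.2, 0.3, -4.0, 0.0]

scores = sensitivity_scores(weights, grads)
print(rank_parameters(scores))
```

Both inputs are available to any client in the standard threat model, which is the point the authors make. Whether a ranking computed from one client's data matches the global impact is the validation the rebuttal promises, and this sketch does not address it.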
Circularity Check
No circularity; attack defined from external sparse-NN insights and evaluated empirically
The paper constructs a hybrid sparse+accumulating Byzantine attack by explicitly defining its two components from external sparse-NN literature (sensitivity ranking and slow poisoning). No equations or claims reduce the attack definition to its own outputs by construction, no fitted parameters are relabeled as predictions, and no load-bearing uniqueness theorems or ansatzes are imported via self-citation. The central claim is an empirical demonstration against eight independent defenses; the derivation chain is self-contained and does not collapse to tautology.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption: Malicious clients (Byzantines) exist and can submit arbitrary poisoned updates in federated learning
- domain assumption: Neural network parameter sensitivities can be identified and leveraged by attackers as side information
invented entities (2)
- sparse attack component: no independent evidence
- slow-accumulating attack component: no independent evidence