arxiv: 2604.14663 · v1 · submitted 2026-04-16 · 💻 cs.CR

EdgeDetect: Importance-Aware Gradient Compression with Homomorphic Aggregation for Federated Intrusion Detection

Pith reviewed 2026-05-10 11:22 UTC · model grok-4.3

classification 💻 cs.CR
keywords federated learning · intrusion detection · gradient compression · homomorphic encryption · edge computing · IoT security · privacy-preserving ML · poisoning attacks

The pith

EdgeDetect binarizes gradients to cut federated IDS communication by 96.9 percent while matching centralized accuracy.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents EdgeDetect as a federated learning approach for intrusion detection that compresses local model updates through median-based binarization of gradients into plus or minus one values. This step reduces uplink data volume by a factor of 32 and pairs with Paillier homomorphic encryption to keep individual updates hidden from the server. Experiments on the CIC-IDS2017 dataset with 2.8 million flows across seven attack classes show 98 percent multi-class accuracy and 97.9 percent macro F1-score, comparable to full-precision centralized training. The system also runs on Raspberry Pi hardware with low memory and energy use and holds 87 percent accuracy under 5 percent poisoning attacks. If the claims hold, collaborative detection becomes feasible in bandwidth-limited 6G-IoT settings without raw data sharing or excessive communication costs.

Core claim

EdgeDetect shows that median-based statistical binarization of gradients to {+1, -1} representations compresses local updates by 32 times while preserving convergence, and that Paillier homomorphic encryption over these binarized gradients enables secure aggregation against honest-but-curious servers. On CIC-IDS2017 the method reaches 98.0 percent multi-class accuracy and 97.9 percent macro F1-score, matching centralized baselines, cuts per-round communication from 450 MB to 14 MB, and sustains 87 percent accuracy under 5 percent poisoning with severe class imbalance.
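
The headline figures in the core claim can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers quoted above (450 MB, 14 MB, 32×); the function names are illustrative, and the 32× factor is read here as "one bit per 32-bit float", which is an assumption about how the factor arises rather than something the summary states.

```python
def reduction_percent(before_mb: float, after_mb: float) -> float:
    """Percentage of per-round traffic removed by compression."""
    return 100.0 * (1.0 - after_mb / before_mb)


def compressed_size_mb(before_mb: float, factor: float) -> float:
    """Payload size after dividing the original size by a compression factor."""
    return before_mb / factor


# 450 MB -> 14 MB, as reported for one communication round.
print(round(reduction_percent(450, 14), 1))  # 96.9
# A pure 32x reduction of 450 MB, for comparison with the reported 14 MB.
print(compressed_size_mb(450, 32))  # 14.0625
```

The two reported numbers are mutually consistent: a 32× reduction of 450 MB is about 14.06 MB, and 14 MB out of 450 MB is a 96.9 percent reduction.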

What carries the argument

Gradient smartification, the median-based statistical binarization that converts local gradient updates to binary {+1, -1} representations before homomorphic encryption and aggregation.
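
A minimal sketch of what median-based binarization could look like, assuming the threshold is the median of the whole gradient vector and that coordinates at or above it map to +1. The paper's exact rule (per layer, per tensor, tie handling) is not given in this summary, so `binarize`, `pack_signs`, and `unpack_signs` are hypothetical names and the toy gradient is made up for illustration.

```python
from __future__ import annotations

import statistics
import struct


def binarize(grad: list[float]) -> list[int]:
    """Map each coordinate to +1 if it is at or above the median, else -1."""
    threshold = statistics.median(grad)
    return [1 if g >= threshold else -1 for g in grad]


def pack_signs(signs: list[int]) -> bytes:
    """Pack +1/-1 values into bits (+1 -> 1, -1 -> 0), eight per byte."""
    out = bytearray((len(signs) + 7) // 8)
    for i, s in enumerate(signs):
        if s > 0:
            out[i // 8] |= 1 << (7 - i % 8)
    return bytes(out)


def unpack_signs(payload: bytes, n: int) -> list[int]:
    """Invert pack_signs for the first n coordinates."""
    return [1 if (payload[i // 8] >> (7 - i % 8)) & 1 else -1 for i in range(n)]


# Toy gradient (illustrative values only).
grad = [0.31, -0.02, 0.07, -0.44, 0.12, 0.00, -0.09, 0.25]
signs = binarize(grad)
payload = pack_signs(signs)
full_precision = struct.pack(f"{len(grad)}f", *grad)  # float32 encoding

assert unpack_signs(payload, len(grad)) == signs
print(signs)                                  # [1, -1, 1, -1, 1, -1, -1, 1]
print(len(full_precision) // len(payload))    # 32
```

Thresholding at the median rather than at zero keeps the +1 and -1 counts roughly balanced even when most coordinates share a sign, which is one plausible reason to prefer it over a plain sign function.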

Load-bearing premise

The median-based statistical binarization of gradients to {+1, -1} preserves model convergence and final accuracy with negligible loss, and the homomorphic aggregation does not introduce new vulnerabilities or accuracy drops beyond those stated.

What would settle it

An experiment in which binarized gradients cause multi-class accuracy to fall below 90 percent on CIC-IDS2017 or in which an honest-but-curious server recovers individual client updates from the encrypted aggregates would falsify the central claims.

Figures

Figures reproduced from arXiv: 2604.14663 by Noor Islam S. Mohammad.

Figure 1: Comparative performance under two hyperparameter ...
Figure 2: Model analysis and classification performance. (a) ...
Figure 5: Binary confusion matrices: Model 2 reduces false ...
Figure 4: Per-class metrics for classical ML baselines: Random ...
Figure 8: Confusion matrices: Random Forest (diagonal domi ...
Figure 7: Classical classifiers on CIC-IDS2017: KNN shows ...
Figure 9: ROC curve analysis: Model comparison across configu ...
Figure 12: Recall curve analysis: Detection performance across ...
Figure 13: Recall-precision trade-off analysis across configura ...
Figure 14: Logistic regression recall analysis: Threshold ...
Original abstract

Federated learning (FL) enables collaborative intrusion detection without raw data exchange, but conventional FL incurs high communication overhead from full-precision gradient transmission and remains vulnerable to gradient inference attacks. This paper presents EdgeDetect, a communication-efficient and privacy-aware federated IDS for bandwidth-constrained 6G-IoT environments. EdgeDetect introduces gradient smartification, a median-based statistical binarization that compresses local updates to $\{+1,-1\}$ representations, reducing uplink payload by $32\times$ while preserving convergence. We further integrate Paillier homomorphic encryption over binarized gradients, protecting against honest-but-curious servers without exposing individual updates. Experiments on CIC-IDS2017 (2.8M flows, 7 attack classes) demonstrate $98.0\%$ multi-class accuracy and $97.9\%$ macro F1-score, matching centralized baselines, while reducing per-round communication from $450$~MB to $14$~MB ($96.9\%$ reduction). Raspberry Pi-4 deployment confirms edge feasibility: $4.2$~MB memory, $0.8$~ms latency, and $12$~mJ per inference with $<0.5\%$ accuracy loss. Under $5\%$ poisoning attacks and severe imbalance, EdgeDetect maintains $87\%$ accuracy and $0.95$ minority class F1 ($p<0.001$), establishing a practical accuracy, communication, and privacy tradeoff for next-generation edge intrusion detection.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The manuscript presents EdgeDetect, a federated learning system for intrusion detection that applies median-based statistical binarization ('gradient smartification') to compress local gradients to {+1, -1} representations, integrates Paillier homomorphic encryption for privacy against honest-but-curious servers, and reports experimental results on the CIC-IDS2017 dataset showing 98.0% multi-class accuracy and 97.9% macro F1-score (matching centralized baselines) with per-round communication reduced from 450 MB to 14 MB (96.9% reduction). It further includes Raspberry Pi-4 edge deployment metrics and robustness evaluation under 5% poisoning attacks.

Significance. If the reported communication savings hold after accounting for encryption overhead and the binarization step is shown to preserve convergence, the work offers a concrete, deployable tradeoff between accuracy, bandwidth, and privacy for federated IDS in constrained 6G-IoT settings. The explicit edge-hardware evaluation and attack-resilience tests are positive contributions that ground the claims in practical conditions.

major comments (3)
  1. [Abstract] Abstract: The headline claim of a 96.9% per-round communication reduction (450 MB to 14 MB) is presented immediately after describing the Paillier homomorphic encryption integration, yet no details on ciphertext expansion, packing, batching, or key size are supplied. Standard Paillier implementations expand even 1-bit values by 2000–4000×, which directly challenges whether the end-to-end encrypted protocol achieves the stated efficiency.
  2. [Gradient smartification] Gradient smartification description: The assertion that median-based binarization to {+1, -1} 'preserves convergence' and incurs 'negligible loss' is stated without any derivation, convergence analysis, or reference to prior theoretical results on sign-based or binarized gradient methods. This assumption underpins both the accuracy-matching claim and the compression factor.
  3. [Experiments] Experiments section: The multi-class accuracy (98.0%) and macro F1 (97.9%) figures are reported without error bars, standard deviations across runs, or ablation on the binarization threshold/median choice, weakening the ability to assess whether the results reliably match centralized baselines.
minor comments (2)
  1. [Abstract] The term 'gradient smartification' is used in the abstract and title without an immediate inline definition or citation; adding a one-sentence gloss would aid readability.
  2. [Deployment] Raspberry Pi-4 deployment numbers (4.2 MB memory, 0.8 ms latency, 12 mJ) are useful but would be strengthened by explicit comparison to a non-compressed or non-encrypted baseline on the same hardware.
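
The ciphertext-expansion concern in major comment 1 can be made concrete with a back-of-the-envelope calculation. The sketch below relies on one standard fact (a Paillier ciphertext is an element of Z_{n^2}, so it is twice as wide as the modulus) and on assumptions that are not taken from the paper: a 1024-bit modulus, a hypothetical federation of 100 clients, and a simple packing scheme in which each slot is just wide enough to hold the sum of one bit per client.

```python
def ciphertext_bits(key_bits: int) -> int:
    """A Paillier ciphertext lives in Z_{n^2}: twice the modulus width."""
    return 2 * key_bits


def slots_per_ciphertext(key_bits: int, num_clients: int) -> int:
    """Binarized coordinates that fit in one plaintext under simple packing.

    Each slot must hold a sum of num_clients one-bit values without
    overflowing into its neighbour (hypothetical packing layout).
    """
    slot_bits = max(1, num_clients.bit_length())
    return (key_bits - 1) // slot_bits  # keep the plaintext below the modulus


def encrypted_mb(plain_mb: float, key_bits: int, num_clients: int, packed: bool) -> float:
    """Encrypted payload size when every plaintext bit is one coordinate."""
    slots = slots_per_ciphertext(key_bits, num_clients) if packed else 1
    return plain_mb * ciphertext_bits(key_bits) / slots


KEY_BITS = 1024     # assumption
NUM_CLIENTS = 100   # assumption, not from the paper

print(ciphertext_bits(KEY_BITS))                              # 2048
print(slots_per_ciphertext(KEY_BITS, NUM_CLIENTS))            # 146
print(encrypted_mb(14, KEY_BITS, NUM_CLIENTS, packed=False))  # 28672.0
print(round(encrypted_mb(14, KEY_BITS, NUM_CLIENTS, packed=True), 1))  # 196.4
```

Under these assumptions, encrypting each bit separately turns 14 MB into roughly 28 GB, while packing brings it to about 196 MB: still below the 450 MB baseline, but far above the 14 MB headline. The real figure depends on key size, client count, and the packing layout the authors actually use.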

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive and detailed comments, which have identified important areas for clarification and strengthening in our manuscript. We address each major comment point by point below and have made revisions to improve the presentation, rigor, and completeness of the work.

  1. Referee: [Abstract] Abstract: The headline claim of a 96.9% per-round communication reduction (450 MB to 14 MB) is presented immediately after describing the Paillier homomorphic encryption integration, yet no details on ciphertext expansion, packing, batching, or key size are supplied. Standard Paillier implementations expand even 1-bit values by 2000–4000×, which directly challenges whether the end-to-end encrypted protocol achieves the stated efficiency.

    Authors: We agree that the abstract presentation could be misleading without explicit qualification of the encryption overhead. The 14 MB figure refers to the size after gradient smartification (binarization) but prior to encryption. In the revised manuscript, we have updated the abstract to state this distinction clearly and added a new paragraph in the methodology section describing the Paillier setup (1024-bit keys), the use of packing to batch multiple binarized values into fewer ciphertexts, and the resulting end-to-end encrypted communication volume. With these measures, the encrypted payload remains substantially smaller than encrypting uncompressed gradients, preserving the practical efficiency gains for edge devices. We have also included a brief note on the net reduction achieved in the encrypted protocol. revision: yes

  2. Referee: [Gradient smartification] Gradient smartification description: The assertion that median-based binarization to {+1, -1} 'preserves convergence' and incurs 'negligible loss' is stated without any derivation, convergence analysis, or reference to prior theoretical results on sign-based or binarized gradient methods. This assumption underpins both the accuracy-matching claim and the compression factor.

    Authors: We acknowledge that the original submission relied primarily on empirical validation without providing theoretical grounding or citations. This is a valid criticism. In the revision, we have added references to prior work on sign-based and binarized gradient methods (including convergence analyses under bounded gradient assumptions) and inserted a concise derivation in the appendix demonstrating that median binarization approximates the sign of the true gradient with high probability when batch sizes are moderate to large. We also discuss why this leads to negligible impact on convergence for the federated IDS setting, supported by the observed matching of centralized baselines. revision: yes

  3. Referee: [Experiments] Experiments section: The multi-class accuracy (98.0%) and macro F1 (97.9%) figures are reported without error bars, standard deviations across runs, or ablation on the binarization threshold/median choice, weakening the ability to assess whether the results reliably match centralized baselines.

    Authors: We concur that the absence of statistical measures and ablations limits the strength of the empirical claims. We have rerun all experiments across 5 independent trials with varied random seeds and now report means accompanied by standard deviations in the revised results (e.g., multi-class accuracy 98.0% ± 0.3%). Additionally, we have incorporated an ablation study comparing the median threshold against mean-based and fixed-threshold binarization, confirming that the median yields the best accuracy-compression tradeoff. These updates appear in an expanded experiments section with a new table. revision: yes
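
The packing idea mentioned in the first response can be illustrated with a toy Paillier implementation. This is a sketch under stated assumptions, not the authors' protocol: the primes are tiny and insecure, the 4-bit slot width and three-client setup are hypothetical, and key generation, aggregation, and decryption are collapsed into one script even though a real deployment would separate those roles.

```python
import math
import secrets

# Toy primes for illustration only; far too small for real security.
P, Q = 104729, 1299709
SLOT_BITS = 4  # holds counts up to 15, enough for the 3 clients below


def keygen(p: int = P, q: int = Q):
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # valid because the generator is fixed to g = n + 1
    return n, (lam, mu)


def encrypt(n: int, m: int) -> int:
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    # (n + 1)^m equals 1 + m*n modulo n^2
    return (1 + m * n) % n2 * pow(r, n, n2) % n2


def add(n: int, c1: int, c2: int) -> int:
    """Multiplying ciphertexts adds the underlying plaintexts."""
    return c1 * c2 % (n * n)


def decrypt(n: int, priv, c: int) -> int:
    lam, mu = priv
    return (pow(c, lam, n * n) - 1) // n * mu % n


def pack(signs) -> int:
    """Encode +1 as 1 and -1 as 0, one SLOT_BITS-wide slot per coordinate."""
    value = 0
    for i, s in enumerate(signs):
        value |= (1 if s > 0 else 0) << (i * SLOT_BITS)
    return value


def unpack_sign_sums(value: int, dim: int, num_clients: int):
    """Turn packed per-coordinate counts of +1 into sums of signs."""
    mask = (1 << SLOT_BITS) - 1
    counts = [(value >> (i * SLOT_BITS)) & mask for i in range(dim)]
    return [2 * c - num_clients for c in counts]


clients = [
    [1, -1, 1, -1, 1, -1, -1, 1],
    [1, 1, -1, -1, 1, -1, 1, 1],
    [-1, -1, 1, -1, 1, 1, -1, 1],
]
n, priv = keygen()
aggregate = encrypt(n, 0)
for signs in clients:  # one ciphertext per client carries all 8 coordinates
    aggregate = add(n, aggregate, encrypt(n, pack(signs)))
sums = unpack_sign_sums(decrypt(n, priv, aggregate), 8, len(clients))
assert sums == [sum(col) for col in zip(*clients)]
print(sums)  # [1, -1, 1, -3, 3, -1, -1, 3]
```

Each client sends a single ciphertext for eight coordinates instead of eight ciphertexts, and the aggregator only ever multiplies ciphertexts. The slot width bounds how many clients can be summed before one slot overflows into the next, so it has to grow with the size of the federation.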

Circularity Check

0 steps flagged

No significant circularity; empirical system with dataset-tied claims

The paper is an empirical system description whose central claims (98.0% accuracy, 96.9% communication reduction, robustness under poisoning) are presented as measured outcomes on the public CIC-IDS2017 dataset rather than any mathematical derivation, fitted parameter renamed as prediction, or self-referential definition. No equations appear in the provided text, and no load-bearing step reduces by construction to its own inputs or to a self-citation chain. The work therefore contains no circularity of the enumerated kinds.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

Abstract-only review yields minimal ledger; no explicit free parameters or invented entities are named, and the central claims rest on standard federated-learning convergence assumptions plus the unstated premise that median binarization is lossless enough for the reported accuracy.

axioms (2)
  • domain assumption Binarized gradients via median statistics preserve convergence behavior of the underlying federated model
    Invoked by the claim that 32x compression occurs 'while preserving convergence' and final accuracy matches centralized training.
  • domain assumption Paillier homomorphic encryption over binarized gradients adds privacy without altering aggregation correctness or model performance
    Stated as protecting against honest-but-curious servers while still achieving the reported accuracy.

pith-pipeline@v0.9.0 · 5560 in / 1671 out tokens · 98544 ms · 2026-05-10T11:22:57.726714+00:00 · methodology


Appendix excerpts

This appendix provides additional theoretical clarification of the proposed Gradient Smartification mechanism, its convergence properties relative to signSGD, a...

Proof Sketch of Theorem 1: We assume $f$ is $L$-smooth. By the standard smoothness inequality,

$$f(w_{t+1}) \le f(w_t) + \langle \nabla f(w_t),\, w_{t+1} - w_t \rangle + \frac{L}{2}\,\|w_{t+1} - w_t\|_2^2.$$

Substituting the update rule $w_{t+1} = w_t - \eta\,\tilde{g}_t$ gives

$$f(w_{t+1}) \le f(w_t) - \eta\,\langle \nabla f(w_t),\, \tilde{g}_t \rangle + \frac{L\eta^2}{2}\,\|\tilde{g}_t\|_2^2.$$

Taking expectation and applying Proposition 1,

$$\mathbb{E}\,\langle \nabla f(w_t),\, \tilde{g}_t \rangle \ge \gamma\,\|\nabla f(w_t)\|_2^2.$$

Thus, E[f(w_{t+1})] ≤ f(w_t) − ηγ‖∇f(w_t...

Non-IID Data Distribution Analysis: Table XXI reports per-class F1 under increasing heterogeneity. Performance degrades smoothly as α decreases, with minority/overlapping classes most affected.

TABLE XXI: Per-Class F1-Scores Under Data Heterogeneity

Attack Class   IID     α=10    α=1.0   α=0.5   α=0.1   Label Skew
BENIGN         0.989   0.987   0.983   0.978   0.971   0.984
DoS            0.989   0.988   ...