arxiv: 2604.06101 · v1 · submitted 2026-04-07 · 💻 cs.CR

Towards Securing IIoT: An Innovative Privacy-Preserving Anomaly Detector Based on Federated Learning

Pith reviewed 2026-05-10 19:06 UTC · model grok-4.3

classification 💻 cs.CR
keywords federated learning · anomaly detection · IIoT · privacy preservation · homomorphic encryption · cybersecurity · industrial internet of things · straggler mitigation

The pith

A federated learning framework with homomorphic encryption and dynamic agent selection detects IIoT anomalies while preserving privacy and cutting communication costs.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents a federated learning system for spotting cyberattacks and other anomalies in industrial IoT settings. Data stays local on each agent, models train collaboratively, and homomorphic encryption shields the shared parameters from inference attacks. A dynamic selection rule picks which agents join each round based on their response delays and data volumes, to avoid stragglers and uneven workloads. The authors report higher accuracy, precision, and F1 scores than standard synchronous or asynchronous federated learning, together with lower total communication volume and faster convergence. If the results hold, industrial operators could monitor equipment and networks without exposing raw production data.

Core claim

The paper proposes an anomaly detection system built on a novel federated learning framework. The framework combines homomorphic encryption, which protects model updates, with a dynamic agent selection scheme that computes a participation threshold from each agent's delay and data size. According to the paper, this combination mitigates straggler effects and communication bottlenecks while delivering higher accuracy, precision, and F1-scores, along with lower communication costs, faster convergence, and improved fairness, compared with baseline federated learning architectures.

What carries the argument

The dynamic agent selection scheme, which sets a participation threshold using measured delays and local data sizes to choose training participants each round and thereby reduce straggler impact without raw data exchange.
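
As a rough illustration of the idea, the sketch below scores each agent from its normalized delay and data size and admits the agents whose score meets a threshold. The scoring formula, the `alpha` weight, the mean-score threshold, and the names `Agent` and `select_agents` are all assumptions made for this example; this page does not give the paper's actual formula.

```python
from dataclasses import dataclass


@dataclass
class Agent:
    name: str
    delay_s: float     # measured response delay, in seconds
    num_samples: int   # size of the agent's local dataset


def select_agents(agents, alpha=0.5):
    """Return (selected agents, threshold) for one training round.

    Hypothetical rule, not the paper's formula:
      score = alpha * (1 - normalized delay) + (1 - alpha) * normalized data size
      threshold = mean score over all agents
    """
    # Fall back to 1.0 so an all-zero column does not divide by zero.
    max_delay = max(a.delay_s for a in agents) or 1.0
    max_size = max(a.num_samples for a in agents) or 1.0
    scores = {
        a.name: alpha * (1 - a.delay_s / max_delay)
        + (1 - alpha) * (a.num_samples / max_size)
        for a in agents
    }
    threshold = sum(scores.values()) / len(scores)
    selected = [a for a in agents if scores[a.name] >= threshold]
    return selected, threshold


agents = [
    Agent("plc-1", delay_s=0.2, num_samples=1000),
    Agent("plc-2", delay_s=2.0, num_samples=200),
    Agent("plc-3", delay_s=0.5, num_samples=800),
]
selected, threshold = select_agents(agents)
print([a.name for a in selected], round(threshold, 3))
```

With these made-up numbers the slow, data-poor agent `plc-2` falls below the threshold and sits the round out. Any rule of this shape favors some agents over others, which is why the load-bearing premise below matters.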

Load-bearing premise

That choosing agents by their delays and data sizes avoids selection bias and still produces a model of equal or better quality than using all agents or fixed schedules.

What would settle it

A controlled experiment in which the dynamic selection scheme produces lower final accuracy, slower convergence, or reduced fairness metrics than either synchronous or asynchronous baselines on the same IIoT datasets would falsify the performance claims.

Figures

Figures reproduced from arXiv: 2604.06101 by Chafika Benzaïd, Samira Kamali Poorazad, Tarik Taleb.

Figure 1: High-level architecture of the DyHFL framework.
Figure 2: DyHFL framework flowchart.

Algorithm 2: Weighted Average Metrics (WAM)
  Input: global metrics for each agent ← Global_MT_1, …, Global_MT_N
  Output: short-term threshold for each preliminary round
  Sort Global_MT_1, Global_MT_2, …, Global_MT_N in descending order
  for each i ← 1 to N do
    weight(i) ← 1 / Global_MT_i
  # Reverse weight = [W_1, W_2, …, W_N]
  weight ← [W_N, W_{N−1}, …, W_1]
  W_Avg ← ∑ …
Figure 3: Convergence speed comparison across FL methods, based on 100 agents.
Figure 4: Comparison of communication cost (exchanged …
Figure 5: Model performance comparison of six FL algorithms (SyncFL, AsyncFL, FedBuff, BFL, ASR …
Original abstract

In light of the growing connectivity and sensitivity of industrial data, cyberattacks and data breaches are becoming more common in the Industrial Internet of Things (IIoT). To cope with such threats, this study presents an anomaly detection system based on a novel Federated Learning (FL) framework. This system detects anomalies such as cyberattacks and protects industrial data privacy by processing data locally and training anomaly detection models on industrial agents without sharing raw data. The proposed FL framework incorporates two key components to enhance both privacy and efficiency. The first component is Homomorphic Encryption (HE), which is integrated into the framework to further protect sensitive data transmissions such as model parameters. HE enhances privacy in FL by preventing adversaries from inferring private industrial data through attacks such as model inversion. The second component is an innovative dynamic agent selection scheme, wherein a selection threshold is calculated based on agent delays and data size. The purpose of this new scheme is to mitigate the straggler effect and the communication bottleneck that occur in traditional FL architectures, such as synchronous and asynchronous architectures. It ensures that agents are not unfairly selected because of the different delays resulting from heterogeneous data in IIoT environments, while simultaneously improving model performance and convergence speed. The proposed framework exhibits superior performance over baseline approaches in terms of accuracy, precision, F1-scores, communication costs, convergence speeds, and fairness rate.
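
To make the role of HE concrete, here is a minimal sketch of additively homomorphic aggregation: a server sums encrypted model updates without ever seeing them in the clear. The abstract does not name the HE scheme, so the textbook Paillier cryptosystem is used here as a stand-in. The tiny primes, the fixed-point `SCALE`, and every function name are assumptions for illustration only, and the key size is far too small to be secure.

```python
import math
import random

# Toy Paillier parameters. Illustrative only: real keys use primes of
# 1024 bits or more, and the paper's actual HE scheme may differ.
P, Q = 1789, 1861
N = P * Q
N2 = N * N
G = N + 1
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # lcm(P-1, Q-1)
MU = pow(LAM, -1, N)   # modular inverse; this shortcut holds for G = N + 1
SCALE = 1000           # fixed-point scale for encoding float weights


def encrypt(m):
    """Encrypt an integer 0 <= m < N under the public key (N, G)."""
    while True:
        r = random.randrange(1, N)
        if math.gcd(r, N) == 1:
            break
    return (pow(G, m, N2) * pow(r, N, N2)) % N2


def decrypt(c):
    """Recover the plaintext integer using the private key (LAM, MU)."""
    return ((pow(c, LAM, N2) - 1) // N * MU) % N


def add_encrypted(c1, c2):
    """Multiplying two ciphertexts adds their plaintexts modulo N."""
    return (c1 * c2) % N2


def encode(x):
    """Map a float to a residue mod N (negatives wrap around)."""
    return round(x * SCALE) % N


def decode(m):
    """Invert encode(), treating residues above N/2 as negative."""
    if m > N // 2:
        m -= N
    return m / SCALE


# Three agents each encrypt a two-parameter update before sending it.
updates = [[0.12, -0.5], [0.3, 0.25], [-0.02, 0.75]]
encrypted = [[encrypt(encode(w)) for w in u] for u in updates]

# Server side: aggregate coordinate-wise on ciphertexts only.
aggregate = encrypted[0]
for enc in encrypted[1:]:
    aggregate = [add_encrypted(a, b) for a, b in zip(aggregate, enc)]

# Key-holder side: decrypt the sum, then average.
summed = [decode(decrypt(c)) for c in aggregate]
averaged = [s / len(updates) for s in summed]
print(summed, averaged)
```

In this sketch the decryption key sits in the same script for brevity. In a deployment the aggregating server would hold only the public key, so individual updates stay hidden from it.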

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript proposes a federated learning framework for anomaly detection in IIoT environments that integrates homomorphic encryption to protect model parameter transmissions and a dynamic agent selection scheme based on agent delays and data sizes. The selection scheme aims to reduce straggler effects and communication bottlenecks while maintaining fairness. The paper claims that this framework achieves superior performance compared to synchronous and asynchronous FL baselines across accuracy, precision, F1-scores, communication costs, convergence speed, and fairness rate.

Significance. If the performance and fairness claims hold under rigorous validation, the work could offer a practical advance in privacy-preserving anomaly detection for heterogeneous IIoT systems by combining HE with adaptive client selection to address both security and efficiency challenges.

major comments (2)
  1. [Abstract] The manuscript asserts superior performance over baselines in accuracy, precision, F1-scores, communication costs, convergence speeds, and fairness rate, yet supplies no quantitative results, tables, experimental setup details, or dataset descriptions to support these claims. This absence is load-bearing because the central contribution rests on demonstrating these improvements.
  2. [Dynamic agent selection scheme] As described, the threshold calculation based on delays and data size is presented as mitigating stragglers without introducing selection bias or degrading model quality. However, no mathematical formulation of the threshold, derivation of unbiasedness for the resulting FedAvg updates, or ablation (e.g., data-size histograms of selected vs. all agents) is provided. This is critical, as bias toward faster/smaller-data agents would undermine the fairness rate and accuracy claims in heterogeneous IIoT settings.
minor comments (1)
  1. [Abstract] A description of the anomaly types (e.g., specific cyberattacks) and of the IIoT datasets used could be added to give context for the performance claims.
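
The selection-bias concern in major comment 2 can be illustrated with a small exact calculation. The sketch below assumes each agent joins a round independently with a known probability, which is a modeling assumption of this example and not the paper's threshold rule; the data sizes, scalar "updates", and probabilities are made up. It compares FedAvg renormalized over the selected agents with an inverse-probability-weighted aggregate, a standard correction that is unbiased whenever every agent has a positive participation probability.

```python
from itertools import product

sizes = [100, 400, 500]     # local dataset sizes n_k (made up)
updates = [1.0, 2.0, 4.0]   # scalar stand-ins for local model updates
probs = [0.9, 0.5, 0.2]     # hypothetical participation probabilities

total = sum(sizes)
# Reference value: FedAvg over all agents, weighted by data size.
full_fedavg = sum(n * w for n, w in zip(sizes, updates)) / total


def expected(aggregate):
    """Exact expectation of aggregate(mask) over independent participation."""
    exp = 0.0
    for mask in product([0, 1], repeat=len(sizes)):
        pr = 1.0
        for m, p in zip(mask, probs):
            pr *= p if m else 1 - p
        exp += pr * aggregate(mask)
    return exp


def naive(mask):
    """FedAvg renormalized over the selected agents only."""
    selected_size = sum(n for m, n in zip(mask, sizes) if m)
    if selected_size == 0:
        return 0.0  # convention for this sketch: an empty round adds nothing
    return sum(n * w for m, n, w in zip(mask, sizes, updates) if m) / selected_size


def ipw(mask):
    """Inverse-probability-weighted aggregate over the selected agents."""
    return sum(
        n * w / p for m, n, w, p in zip(mask, sizes, updates, probs) if m
    ) / total


print("full FedAvg:", full_fedavg)
print("E[naive]:   ", expected(naive))
print("E[IPW]:     ", expected(ipw))
```

Here the fast, small-data agent participates most often, so the naive aggregate is pulled toward its update, while the reweighted aggregate matches full-participation FedAvg in expectation. Whether the paper's scheme behaves like either case is exactly what the requested derivation and ablation would need to establish.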

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on our manuscript. We address each major comment below and will incorporate the suggested clarifications and additions in the revised version.

  1. Referee: [Abstract] The manuscript asserts superior performance over baselines in accuracy, precision, F1-scores, communication costs, convergence speeds, and fairness rate, yet supplies no quantitative results, tables, experimental setup details, or dataset descriptions to support these claims. This absence is load-bearing because the central contribution rests on demonstrating these improvements.

    Authors: We agree that the abstract would be strengthened by including key quantitative highlights. The manuscript body (Section 3 for dataset and experimental setup, Section 4 for results) contains the supporting tables and metrics; we will revise the abstract to reference specific improvements (e.g., accuracy, F1, and communication-cost reductions) while staying within its length limit.
    Revision: yes

  2. Referee: [Dynamic agent selection scheme] As described, the threshold calculation based on delays and data size is presented as mitigating stragglers without introducing selection bias or degrading model quality. However, no mathematical formulation of the threshold, derivation of unbiasedness for the resulting FedAvg updates, or ablation (e.g., data-size histograms of selected vs. all agents) is provided. This is critical, as bias toward faster/smaller-data agents would undermine the fairness rate and accuracy claims in heterogeneous IIoT settings.

    Authors: We will add the explicit mathematical definition of the selection threshold (based on normalized delay and data-size terms) to Section 2.3, include a short derivation establishing that the resulting weighted FedAvg estimator remains unbiased under standard assumptions on client participation, and append ablation figures (data-size histograms and fairness-rate curves) comparing selected versus full agent pools. These revisions will directly substantiate the fairness and convergence claims.
    Revision: yes

Circularity Check

0 steps flagged

No circularity in derivation chain; empirical claims lack mathematical reductions

The manuscript presents a descriptive FL framework combining HE for privacy and a dynamic agent selection heuristic based on delays and data size, with superiority asserted via empirical metrics (accuracy, F1, fairness, convergence). It exhibits no equations, derivations, or parameter-fitting procedures that could reduce predictions to inputs by construction. The selection scheme is introduced as an innovation without load-bearing self-citations, uniqueness theorems, or ansatzes smuggled in from prior work. Performance claims rest on comparisons to baselines rather than on deductive steps, so the chain is self-contained and non-circular under the defined criteria.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The framework assumes standard properties of federated learning and homomorphic encryption, with the main addition being the selection scheme whose parameters are not specified.

free parameters (1)
  • selection threshold
    The threshold for agent selection is calculated based on delays and data size, likely requiring tuning or fitting to specific IIoT environments.
axioms (1)
  • domain assumption: Heterogeneous data and delays across IIoT agents can be mitigated by dynamic selection without harming model convergence.
    Invoked in the description of the dynamic agent selection scheme.

pith-pipeline@v0.9.0 · 5561 in / 1360 out tokens · 89720 ms · 2026-05-10T19:06:18.658412+00:00 · methodology
