
arxiv: 2604.23332 · v1 · submitted 2026-04-25 · 💻 cs.CR · cs.NI

Advanced Anomaly Detection and Threat Intelligence in Zero Trust IoT Environments Using Machine Learning

Pith reviewed 2026-05-08 07:52 UTC · model grok-4.3

classification 💻 cs.CR cs.NI
keywords anomaly detection · threat intelligence · zero trust IoT · machine learning · SMOTE · KDD Cup 1999 · intrusion detection · false positives

The pith

SMOTE significantly enhances SVM, RF, and DT performance for anomaly detection in Zero Trust IoT by balancing the KDD Cup 1999 dataset.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper applies machine learning to detect anomalies in IoT networks secured by Zero Trust principles. It trains Support Vector Machine (SVM), Random Forest (RF), and Decision Tree (DT) models on the KDD Cup 1999 dataset after using SMOTE to correct class imbalance. The authors report that this step raises detection accuracy and reduces false positives. This matters because connected devices face increasingly sophisticated attacks that perimeter defenses cannot handle. The paper also investigates edge-based machine learning and blockchain as supplementary methods for detecting malicious URLs and advanced persistent threats (APTs).

Core claim

The central discovery is that incorporating SMOTE to balance the KDD Cup 1999 dataset enables SVM, Random Forest, and Decision Tree classifiers to achieve higher accuracy in identifying network intrusions and providing threat intelligence within Zero Trust IoT settings, while also reducing false positives.

What carries the argument

SMOTE oversampling technique paired with supervised ML classifiers (SVM, RF, DT) to mitigate class imbalance in intrusion detection data.
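
As a rough illustration of the oversampling step, the sketch below interpolates between minority-class samples and their nearest minority-class neighbours, which is the core idea behind SMOTE. It is a minimal pure-Python sketch, not the authors' implementation: the function name `smote`, its parameters, and the toy feature vectors are all assumptions made for illustration, and a real pipeline would more likely rely on an established library such as imbalanced-learn.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic points by interpolating between minority samples
    and one of their k nearest minority-class neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy, made-up feature vectors (hypothetical, not drawn from KDD Cup 1999).
normal = [[0.1 * i, 1.0] for i in range(20)]                # majority class
attack = [[5.0, 5.0], [5.5, 4.5], [6.0, 5.2], [5.2, 5.8]]   # minority class

extra = smote(attack, n_synthetic=len(normal) - len(attack), k=3)
balanced_attack = attack + extra
print(len(normal), len(balanced_attack))
```

Because each synthetic point lies on a line segment between two existing minority samples, the new points stay inside the region the minority class already occupies. The classifiers (SVM, RF, DT) would then be trained on the balanced set.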

Load-bearing premise

The KDD Cup 1999 dataset accurately simulates current IoT network intrusions and regular traffic within Zero Trust architectures.

What would settle it

A direct comparison showing that SMOTE does not improve model performance on a modern IoT intrusion dataset would undercut the claim that the approach is effective in Zero Trust IoT settings.

Figures

Figures reproduced from arXiv: 2604.23332 by Chiew Foong Kwong, Jawad Hussain, Muhammad Umair Basharat, Waqas Khalid.

Figure 1: Train vs Test Accuracy Comparison.

Although deep learning models achieved competitive accuracy, they did not surpass Random Forest. The RNN model showed a significantly low F1-score, indicating poor minority-class detection in tabular intrusion data.

C. Impact of SMOTE on Detection Quality

The presence of class imbalance may generate false high accuracy in intrusion detection, since it is possible to predi…
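
A small illustration of how class imbalance can inflate accuracy: the sketch below scores two hypothetical detectors on a made-up 990-to-10 split of normal and attack records. The helper names (`confusion_counts`, `summarize`), the class ratio, and both sets of predictions are assumptions chosen for illustration and are not results from the paper.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary task."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def summarize(y_true, y_pred):
    """Return accuracy, precision, recall, F1, and false-positive rate."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "false_positive_rate": fp / (fp + tn) if fp + tn else 0.0,
    }


# Hypothetical data: 990 normal records (0) followed by 10 attacks (1).
y_true = [0] * 990 + [1] * 10

# Detector A labels everything as normal traffic.
always_normal = [0] * 1000
# Detector B raises 30 false alarms but catches 8 of the 10 attacks.
imperfect = [1] * 30 + [0] * 960 + [1] * 8 + [0] * 2

report_a = summarize(y_true, always_normal)
report_b = summarize(y_true, imperfect)
print(report_a)
print(report_b)
```

Detector A reaches 99% accuracy while detecting no attacks at all, whereas detector B has lower accuracy but non-zero recall and F1. This is why precision, recall, F1, and the false-positive rate are more informative than accuracy alone on imbalanced intrusion data.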
Original abstract

The growing adoption of IoT and cloud computing, combined with rapid advancements in digital technologies, has considerably increased the cyber-attack surface, resulting in increasingly complex and persistent attacks. Traditional security methods, primarily based on perimeter defenses, are insufficient to meet these developing threats, especially within the context of a Zero Trust Security (ZTS) architecture. This study investigates the application of sophisticated artificial intelligence (AI) and machine learning (ML) techniques, including the use of the Synthetic Minority Oversampling Technique (SMOTE), to improve anomaly detection and threat intelligence systems. This study focuses on how Support Vector Machine (SVM), Random Forest (RF), and Decision Tree (DT) classifiers might increase threat detection accuracy in IoT environments. The research endeavors to improve cybersecurity resilience by mitigating false positives and providing actionable intelligence through supervised learning algorithms. The KDD Cup 1999 dataset is used in the study to assess how well these models perform in simulating various network intrusions and regular traffic. The application of SMOTE significantly enhanced the performance of these models by addressing class imbalance, leading to improved detection accuracy. Furthermore, as supplementary methods for detecting malicious URLs and advanced persistent threats (APTs), edge-based machine learning and blockchain technology are investigated. This study addresses the shortcomings of conventional security systems and supports the growing demand for reliable threat detection in a world that is becoming more interconnected. It also advances the creation of more proactive and adaptable cybersecur

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper claims that applying SMOTE to address class imbalance significantly improves the anomaly detection accuracy of SVM, Random Forest, and Decision Tree classifiers when evaluated on the KDD Cup 1999 dataset, and that this approach (together with edge-based ML and blockchain methods) advances threat intelligence and anomaly detection within Zero Trust IoT architectures.

Significance. If the reported SMOTE-driven accuracy gains on KDD Cup 1999 are reproducible and the dataset choice is justified, the work would supply a concrete empirical demonstration that oversampling can mitigate imbalance in supervised network-intrusion tasks. Such a result could be useful for practitioners working with legacy TCP/IP corpora, though its direct bearing on modern IoT Zero Trust deployments would still require additional bridging evidence.

major comments (2)
  1. [Abstract] Abstract: the central claim that KDD Cup 1999 'simulat[es] various network intrusions and regular traffic' inside Zero Trust IoT environments is load-bearing yet unsupported; the manuscript provides no IoT-specific feature mapping, no comparison against IoT corpora (IoT-23, UNSW-NB15 IoT subset), and no ablation showing that the SMOTE lift survives when traffic distributions reflect constrained devices or micro-segmentation.
  2. [Methods] Methods / Experimental Setup: no quantitative baseline comparisons, cross-validation details, or confusion-matrix results are referenced in the abstract, and the full text supplies no evidence that the reported performance gains are statistically significant or robust to the shift from 1999-era TCP/IP features to IoT protocols and continuous-authentication signals.
minor comments (1)
  1. [Abstract] Abstract ends abruptly with the truncated word 'cybersecur'; a complete sentence and consistent terminology for 'Zero Trust Security (ZTS)' versus 'Zero Trust IoT Environments' would improve readability.
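
The cross-validation details requested in major comment 2 could be made concrete along the following lines. This is a hedged sketch of stratified k-fold splitting in pure Python; the function names (`stratified_folds`, `train_test_splits`) and the toy label vector are assumptions for illustration, and the paper's actual protocol is not described in the text available here. The comment about oversampling only the training folds reflects common practice for avoiding leakage, not a statement about what the authors did.

```python
import random
from collections import defaultdict


def stratified_folds(labels, n_folds=10, seed=0):
    """Split indices into n_folds lists with roughly equal class ratios."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(n_folds)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class round-robin so every fold receives its share.
        for position, idx in enumerate(indices):
            folds[position % n_folds].append(idx)
    return folds


def train_test_splits(labels, n_folds=10, seed=0):
    """Yield (train_idx, test_idx) pairs, one per held-out fold."""
    folds = stratified_folds(labels, n_folds, seed)
    for held_out in range(n_folds):
        test_idx = folds[held_out]
        train_idx = [i for f, fold in enumerate(folds)
                     if f != held_out for i in fold]
        yield train_idx, test_idx


# Hypothetical labels: 180 normal records (0) and 20 attacks (1).
labels = [0] * 180 + [1] * 20

fold_summary = []
for train_idx, test_idx in train_test_splits(labels, n_folds=10):
    # Oversampling (for example SMOTE) would be applied to the train_idx rows
    # only; the test_idx rows keep the original class ratio.
    attacks_in_test = sum(labels[i] for i in test_idx)
    fold_summary.append((len(train_idx), len(test_idx), attacks_in_test))

print(fold_summary[0])
```

Reporting the per-fold confusion matrices from such a split, rather than a single aggregate accuracy, would address the referee's request for baseline and robustness evidence.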

Simulated Author's Rebuttal

2 responses · 0 unresolved

Thank you for the constructive feedback on our manuscript. We address the referee's major comments point by point below, acknowledging areas where the original submission requires clarification or expansion to better support the claims regarding the KDD Cup 1999 dataset in Zero Trust IoT contexts. We commit to revisions that strengthen the presentation without overstating the current results.

Point-by-point responses
  1. Referee: [Abstract] the central claim that KDD Cup 1999 'simulat[es] various network intrusions and regular traffic' inside Zero Trust IoT environments is load-bearing yet unsupported; the manuscript provides no IoT-specific feature mapping, no comparison against IoT corpora (IoT-23, UNSW-NB15 IoT subset), and no ablation showing that the SMOTE lift survives when traffic distributions reflect constrained devices or micro-segmentation.

    Authors: We agree that the manuscript does not provide IoT-specific feature mappings or direct comparisons to datasets such as IoT-23 or the IoT subset of UNSW-NB15, nor does it include an ablation study on constrained-device traffic patterns. The KDD Cup 1999 dataset was used as a standard, publicly available benchmark to isolate and demonstrate the effect of SMOTE on class imbalance for the SVM, Random Forest, and Decision Tree classifiers in a supervised anomaly-detection setting. In the revised version we will rephrase the abstract and introduction to present the experiments as a proof-of-concept on legacy network-intrusion data, explicitly note the absence of IoT-protocol features, and add a dedicated limitations paragraph that discusses the need for future validation on IoT corpora. We will also outline planned follow-up work that includes preliminary mappings to IoT traffic characteristics. revision: yes

  2. Referee: [Methods] no quantitative baseline comparisons, cross-validation details, or confusion-matrix results are referenced in the abstract, and the full text supplies no evidence that the reported performance gains are statistically significant or robust to the shift from 1999-era TCP/IP features to IoT protocols and continuous-authentication signals.

    Authors: We acknowledge that the abstract omits these experimental details and that the original text does not report formal statistical significance tests or robustness checks against IoT-specific protocol shifts. The revised manuscript will update the abstract to include the main quantitative outcomes (accuracy, precision, recall, and F1 improvements after SMOTE), state that 10-fold cross-validation was employed, and reference the confusion matrices already present in the results section. We will add paired statistical tests (e.g., McNemar or Wilcoxon signed-rank) to quantify the significance of the SMOTE-induced gains. Regarding transfer to IoT protocols and continuous-authentication signals, we will expand the discussion to clarify that the imbalance-handling technique is intended to be general, while explicitly stating that direct empirical validation on IoT data remains future work. revision: yes
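
One of the paired tests the authors mention, McNemar's test, can be sketched as follows. This is a minimal exact (binomial) version in pure Python under stated assumptions: the function name `mcnemar_exact` and the toy predictions for a classifier trained with and without SMOTE are hypothetical and do not come from the paper.

```python
from math import comb


def mcnemar_exact(y_true, pred_a, pred_b):
    """Two-sided exact McNemar test on paired predictions.

    b counts cases where model A is right and model B is wrong;
    c counts cases where model A is wrong and model B is right.
    Under the null hypothesis, b follows Binomial(b + c, 0.5).
    """
    b = c = 0
    for truth, a, other in zip(y_true, pred_a, pred_b):
        a_ok = a == truth
        other_ok = other == truth
        if a_ok and not other_ok:
            b += 1
        elif other_ok and not a_ok:
            c += 1
    n = b + c
    if n == 0:
        return b, c, 1.0  # the models never disagree
    tail = sum(comb(n, i) for i in range(min(b, c) + 1)) / 2 ** n
    return b, c, min(1.0, 2 * tail)


# Hypothetical test set: 10 attacks (1) followed by 10 normal records (0).
y_true = [1] * 10 + [0] * 10
# Without SMOTE: misses 8 of the 10 attacks, no false alarms.
pred_plain = [0] * 8 + [1] * 2 + [0] * 10
# With SMOTE: catches every attack but raises one false alarm.
pred_smote = [1] * 10 + [1] + [0] * 9

b, c, p_value = mcnemar_exact(y_true, pred_plain, pred_smote)
print(b, c, p_value)
```

The test uses only the discordant pairs, so it asks whether the two models disagree in a systematically one-sided way on the same test records. That makes it a natural fit for comparing a classifier before and after oversampling on a fixed held-out set.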

Circularity Check

0 steps flagged

No circularity; empirical evaluation on public dataset

Full rationale

The paper conducts an empirical study applying SMOTE to SVM, RF, and DT classifiers on the KDD Cup 1999 dataset to report improved anomaly detection accuracy. No mathematical derivations, equations, functional forms, or predictions derived from fitted parameters exist in the provided text. Claims rest on experimental results from a public benchmark rather than any self-definitional reduction, fitted-input-as-prediction, or self-citation load-bearing step. Dataset relevance to IoT/Zero Trust is a separate validity issue outside circularity criteria.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the representativeness of a dataset that is more than 25 years old and on the unstated assumption that standard ML hyper-parameters and preprocessing choices will generalize to contemporary IoT traffic.

axioms (1)
  • Domain assumption: the KDD Cup 1999 dataset faithfully represents modern IoT network behavior and attack patterns.
    Explicitly invoked when the authors state the dataset is used to simulate intrusions and regular traffic in IoT environments.


