arxiv: 2607.01194 · v1 · pith:VAOKE5KYnew · submitted 2026-07-01 · 💻 cs.CR

Detecting Adversarial Evasion Attacks Against Autoencoder-Based Network Intrusion Detection Systems

Pith reviewed 2026-07-02 10:17 UTC · model grok-4.3

classification 💻 cs.CR
keywords adversarial detection · network intrusion detection · autoencoders · evasion attacks · IoT security · reconstruction errors · perturbation consistency · PANDA framework

The pith

Two detectors flag adversarial evasion attacks on autoencoder-based NIDS by tracking where reconstruction errors concentrate and whether packet-level features stay consistent.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes two detectors to identify adversarial examples generated by the PANDA framework against autoencoder-based network intrusion detection systems. These attacks convert network traffic to images to apply vision-domain perturbations without changing attack semantics. The Residual Localisation Detector monitors spatial concentration of errors in inter-arrival time regions of image space, while the Feature-Space Perturbation Consistency Detector checks packet-level features directly. Evaluation on IoT traffic shows both achieve detection metrics of 0.99 or higher for true negative rate, true positive rate, precision, recall, and F1-score. This approach combines reconstruction scoring with consistency checks to defend against such transferred attacks.

Core claim

The Residual Localisation Detector and the Feature-Space Perturbation Consistency Detector achieve near-perfect detection performance with TNR, TPR, precision, recall, and F1-score all at or above 0.99 against PANDA-generated adversarial examples on benign, malicious, and adversarial traffic from the UQ-IoT dataset.

What carries the argument

The Residual Localisation Detector (RLD), which tracks the spatial concentration of reconstruction errors in the inter-arrival time feature region in image space, and the Feature-Space Perturbation Consistency (FPC) Detector, which operates on packet-level inter-arrival time features in packet-feature space.
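
The idea behind the RLD can be illustrated with a minimal sketch. It assumes, purely for illustration, that the image is a row-major grid of floats, that the inter-arrival time features occupy a known range of columns, and that a single threshold on the share of residual energy in that range separates adversarial from other traffic. The function names, the column range, and the threshold are hypothetical and do not come from the paper.

```python
from typing import List, Tuple

Image = List[List[float]]  # rows of pixel intensities (assumed layout)


def residual_map(original: Image, reconstructed: Image) -> Image:
    """Per-pixel squared reconstruction error."""
    return [
        [(o - r) ** 2 for o, r in zip(o_row, r_row)]
        for o_row, r_row in zip(original, reconstructed)
    ]


def region_concentration(residual: Image, cols: Tuple[int, int]) -> float:
    """Share of total residual energy inside columns [start, stop)."""
    start, stop = cols
    total = sum(sum(row) for row in residual)
    if total == 0.0:
        return 0.0  # perfect reconstruction: nothing to localise
    inside = sum(sum(row[start:stop]) for row in residual)
    return inside / total


def rld_flag(original: Image, reconstructed: Image,
             iat_cols: Tuple[int, int], threshold: float) -> bool:
    """Flag a sample when errors concentrate in the assumed IAT columns."""
    score = region_concentration(residual_map(original, reconstructed), iat_cols)
    return score > threshold
```

Under these assumptions, a reconstruction whose error is spread evenly over a four-column image puts about a quarter of the residual energy in any single column, while an error confined to the assumed inter-arrival time column puts all of it there. How the paper actually defines the region and the decision rule is not stated on this page.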

If this is right

  • Adversarial examples can be distinguished from benign and malicious traffic using spatial error concentration in image space.
  • Packet-level feature inconsistency provides an independent check for perturbations.
  • Integration of reconstruction-based scoring with perturbation consistency checks provides a practical defence for NIDS.
  • High detection performance holds across multiple IoT devices in the evaluated dataset.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Similar detectors could be tested on other network datasets beyond UQ-IoT to check generalizability.
  • The approach might extend to other types of ML-based NIDS beyond autoencoders if similar perturbation patterns occur.
  • Combining image-space and feature-space checks could reduce false positives in real-world deployments.

Load-bearing premise

PANDA-generated adversarial perturbations produce detectable spatial concentration of reconstruction errors in the inter-arrival time region and inconsistencies in packet-level features.

What would settle it

A test where the detectors fail to achieve TNR, TPR, precision, recall, or F1-score of at least 0.99 on adversarial examples from the UQ-IoT dataset would falsify the performance claim.
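
Such a test reduces to computing the five reported metrics from a confusion matrix and comparing each against the 0.99 floor. The sketch below assumes that adversarial samples are the positive class; the class and function names are illustrative, and any counts passed in are the caller's own, not results from the paper.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Confusion:
    tp: int  # adversarial samples flagged as adversarial
    fp: int  # non-adversarial samples flagged as adversarial
    tn: int  # non-adversarial samples passed through
    fn: int  # adversarial samples missed


def _ratio(numerator: float, denominator: float) -> float:
    """Division that returns 0.0 instead of raising on an empty denominator."""
    return numerator / denominator if denominator else 0.0


def detection_metrics(c: Confusion) -> Dict[str, float]:
    tpr = _ratio(c.tp, c.tp + c.fn)        # true positive rate (equals recall)
    tnr = _ratio(c.tn, c.tn + c.fp)        # true negative rate
    precision = _ratio(c.tp, c.tp + c.fp)
    f1 = _ratio(2 * precision * tpr, precision + tpr)
    return {"tnr": tnr, "tpr": tpr, "precision": precision,
            "recall": tpr, "f1": f1}


def meets_floor(metrics: Dict[str, float], floor: float = 0.99) -> bool:
    """True only if every metric is at or above the floor."""
    return all(value >= floor for value in metrics.values())
```

Because recall and TPR are the same quantity, the claim effectively constrains four independent numbers; a single metric below 0.99 is enough for `meets_floor` to return `False`.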

Figures

Figures reproduced from arXiv: 2607.01194 by Ashim Siwakoti, Niklas Bunzel.

Figure 1: Residual Localisation Detector (RLD) pipeline.
Figure 2: Residual heatmaps for clean vs. adversarial traffic
Figure 3: Feature-Space Perturbation Consistency (FPC) de…
Figure 4: Effectiveness of adversarial perturbation on in…
Figure 6: FPC reconstruction errors for clean vs. adversarial…
Figure 7: FGSM sensitivity to 𝜖. Reconstruction error increases from 𝜖 = 0.1 to 0.5, then decreases. Optimal: 𝜖 = 0.5.
Original abstract

Evasion attacks deliberately manipulate input to an ML-based system to produce an incorrect prediction while the manipulated input still appears benign. The PANDA framework has demonstrated that adversarial examples developed for the vision domain can be transferred to the network domain by converting packet sequences into invertible grayscale images, enabling gradient-based attacks such as masked FGSM against autoencoder-based network intrusion detection systems (NIDS). These attacks manipulate the NIDS anomaly score without altering the underlying attack semantics, leaving defenders without a straightforward way to distinguish between benign flows and carefully perturbed malicious traffic. In this paper, we propose two complementary detectors: the Residual Localisation Detector (RLD), which tracks the spatial concentration of reconstruction errors in the inter-arrival time feature region in image space; and the Feature-Space Perturbation Consistency (FPC) Detector, which operates directly on packet-level inter-arrival time features in packet-feature space. We evaluate both detectors on benign, malicious, and adversarial traffic from multiple IoT devices in the UQ-IoT dataset. Both detectors achieve near-perfect detection performance (TNR, TPR, precision, recall, and F1-score $\geq 0.99$) against adversarial examples across the evaluated IoT traffic. Our results indicate that integrating reconstruction-based scoring with perturbation consistency checks, in both image space and packet-feature space, offers a practical defence against emerging PANDA-style adversarial attacks on NIDS.
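
For orientation, the masked FGSM step mentioned in the abstract can be written in the standard form of the fast gradient sign method (Goodfellow et al.), restricted to a mask. This is a sketch under assumed notation, not necessarily the PANDA formulation: $x$ is the image encoding of a packet sequence scaled to $[0,1]$, $f$ the autoencoder, $s$ the anomaly score, $J$ the attacker's objective, $M$ a binary mask selecting the pixels the attacker may change, and $\epsilon$ the step size. None of these symbols appear in the abstract.

```latex
s(x) = \lVert x - f(x) \rVert_2^2,
\qquad
x_{\mathrm{adv}} = \operatorname{clip}_{[0,1]}\!\bigl( x + \epsilon \, M \odot \operatorname{sign}\bigl( \nabla_x J(x) \bigr) \bigr)
```

With $M$ set to all ones this reduces to ordinary FGSM. The choice of $J$ (for example, an objective that lowers $s$ so that malicious traffic scores as benign) is the attacker's and is not specified on this page.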

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper proposes two detectors—the Residual Localisation Detector (RLD), which tracks spatial concentration of reconstruction errors in the inter-arrival time region of image space, and the Feature-Space Perturbation Consistency (FPC) Detector, which checks packet-level feature consistency—to identify PANDA-generated adversarial examples against autoencoder-based NIDS. Evaluation on benign, malicious, and adversarial traffic from the UQ-IoT dataset reports near-perfect performance with TNR, TPR, precision, recall, and F1-score all ≥ 0.99.

Significance. If the results hold, the work offers a practical, representation-specific defense against transferred adversarial attacks on network anomaly detection by exploiting perturbation artifacts in both image and packet-feature spaces. It introduces two deterministic detectors that complement reconstruction scoring and could be integrated into existing autoencoder NIDS without requiring retraining.

major comments (2)
  1. [Abstract and Evaluation] (implied §4–5): The reported metrics ≥ 0.99 are shown only for non-adaptive masked-FGSM attacks transferred from vision. No experiments test adaptive adversaries who jointly optimize to fool the autoencoder while also keeping RLD error maps diffuse in the inter-arrival region and FPC consistency scores high. Because both detectors are deterministic functions of the same image/feature representation the attack manipulates, such joint optimization is feasible and directly tests the central claim of a practical defence.
  2. [§3] (Detector Formulations): The RLD and FPC are presented as exploiting inherent properties of PANDA perturbations, yet the manuscript provides no analysis or bounds showing that an adversary cannot simultaneously minimize reconstruction error concentration and feature inconsistency while still evading the base autoencoder. This assumption is load-bearing for the claim that the detectors reliably detect PANDA-style attacks.
minor comments (2)
  1. [Abstract] The abstract states results across 'multiple IoT devices' but does not list the exact devices, train/test split ratios, or attack generation parameters (ε, mask size), which are needed for reproducibility.
  2. [§3] Notation for the RLD error map and FPC consistency metric would benefit from explicit equations rather than prose descriptions alone.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive comments. We address each major comment below, indicating where revisions will be made to the manuscript.

  1. Referee: [Abstract and Evaluation] The reported metrics ≥ 0.99 are shown only for non-adaptive masked-FGSM attacks transferred from vision. No experiments test adaptive adversaries who jointly optimize to fool the autoencoder while also keeping RLD error maps diffuse in the inter-arrival region and FPC consistency scores high. Because both detectors are deterministic functions of the same image/feature representation the attack manipulates, such joint optimization is feasible and directly tests the central claim of a practical defence.

    Authors: We agree that evaluation against adaptive adversaries who explicitly target both the base autoencoder and the proposed detectors would strengthen the claims. Our experiments follow the PANDA threat model of transferred, non-adaptive attacks. Joint optimization is possible in principle but requires balancing multiple conflicting objectives (low reconstruction error at the NIDS while diffusing residuals in the inter-arrival region and preserving packet-feature consistency), which may not be trivial given the deterministic nature of RLD and FPC. We will add a dedicated limitations paragraph discussing this gap and the practical difficulties of such adaptive attacks. No new experiments are planned for the revision. revision: partial

  2. Referee: [§3] The RLD and FPC are presented as exploiting inherent properties of PANDA perturbations, yet the manuscript provides no analysis or bounds showing that an adversary cannot simultaneously minimize reconstruction error concentration and feature inconsistency while still evading the base autoencoder. This assumption is load-bearing for the claim that the detectors reliably detect PANDA-style attacks.

    Authors: The detectors are introduced on the basis of empirical observations of reconstruction-error concentration and feature inconsistency produced by the evaluated PANDA perturbations. The manuscript does not claim or prove that no adversary can ever evade all three components simultaneously. We will revise the wording in §3 to clarify that RLD and FPC are heuristic detectors motivated by observed artifacts rather than theoretically guaranteed to be evasion-proof, and we will explicitly note the absence of formal bounds. revision: yes
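
The joint optimization discussed in both exchanges can be made concrete with one possible formalisation. It is a sketch only: $s$ denotes the base autoencoder's anomaly score, $c_{\mathrm{RLD}}$ a differentiable surrogate for the RLD's error-concentration score, $d_{\mathrm{FPC}}$ a surrogate for the FPC inconsistency score, $M$ the attacker's perturbation mask, and $\lambda_1, \lambda_2 \ge 0$ trade-off weights. All of these symbols are assumptions introduced here; neither the referee report nor the rebuttal defines them.

```latex
\min_{\delta} \;\; s(x + \delta)
  + \lambda_1 \, c_{\mathrm{RLD}}(x + \delta)
  + \lambda_2 \, d_{\mathrm{FPC}}(x + \delta)
\quad \text{s.t.} \quad
\lVert \delta \rVert_\infty \le \epsilon, \;\; \delta = M \odot \delta
```

Setting $\lambda_1 = \lambda_2 = 0$ recovers a non-adaptive attack that targets only the autoencoder, which is the setting the reported metrics cover. Whether the three terms can be driven down together is the open question the referee raises.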

Circularity Check

0 steps flagged

No significant circularity in derivation chain

The paper defines RLD and FPC detectors from observable properties of PANDA perturbations (spatial error concentration in inter-arrival image region; packet-feature inconsistency) and reports empirical performance on distinct benign/malicious/adversarial traffic splits from the UQ-IoT dataset. No equations reduce the claimed TPR/TNR/precision/recall/F1 ≥ 0.99 to a fitted parameter or self-referential definition drawn from the same evaluation data; the detectors are deterministic functions applied to held-out examples rather than retrained or tuned on the adversarial set itself. External PANDA citations supply the attack method but do not load-bear the detection result. The derivation is therefore self-contained against the paper's own benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 2 invented entities

The central claim rests on the existence of localized reconstruction errors and feature inconsistency under PANDA attacks; no free parameters or invented physical entities are introduced.

invented entities (2)
  • Residual Localisation Detector (RLD) no independent evidence
    purpose: Detects adversarial perturbations via spatial concentration of reconstruction errors in image space
    New method introduced by the paper with no independent prior validation cited.
  • Feature-Space Perturbation Consistency (FPC) Detector no independent evidence
    purpose: Detects attacks by checking consistency of packet-level inter-arrival time features
    New method introduced by the paper with no independent prior validation cited.

pith-pipeline@v0.9.1-grok · 5781 in / 1093 out tokens · 25751 ms · 2026-07-02T10:17:22.421335+00:00 · methodology

Reference graph

Works this paper leans on

27 extracted references · 22 canonical work pages · 6 internal anchors

  1. [1]

    Arwa Aldweesh, Abdelouahid Derhab, and Ahmed Z. Emam. 2020. Deep learning approaches for anomaly-based intrusion detection systems: A survey, taxonomy, and open issues. Knowledge-Based Systems 189 (2020), 105124. https://doi.org/10.1016/j.knosys.2019.105124

  2. [2]

    Mohammad Arafah, Iain Phillips, Asma Adnane, Wael Hadi, Mohammad Alauthman, and Abedal-Kareem Al-Banna. 2025. Anomaly-based network intrusion detection using denoising autoencoder and Wasserstein GAN synthetic attacks. Applied Soft Computing 168 (2025), 112455. https://doi.org/10.1016/j.asoc.2024.112455

  3. [3]

    Battista Biggio and Fabio Roli. 2018. Wild patterns: Ten years after the rise of adversarial machine learning. Pattern Recognition 84 (Dec. 2018), 317–331. https://doi.org/10.1016/j.patcog.2018.07.023

  4. [4]

    Zhaomin Chen, Chai Kiat Yeo, Bu Sung Lee, and Chiew Tong Lau. 2018. Autoencoder-based network anomaly detection. In 2018 Wireless telecommunications symposium (WTS). IEEE, 1–5

  5. [5]

    Jacopo Cortellazzi, Feargus Pendlebury, Daniel Arp, Erwin Quiring, Fabio Pierazzi, and Lorenzo Cavallaro. 2024. Intriguing Properties of Adversarial ML Attacks in the Problem Space [Extended Version]. arXiv:1911.02142 [cs.CR] https://arxiv.org/abs/1911.02142

  6. [6]

    Ambra Demontis, Marco Melis, Maura Pintor, Matthew Jagielski, Battista Biggio, Alina Oprea, Cristina Nita-Rotaru, and Fabio Roli. 2019. Why Do Adversarial Attacks Transfer? Explaining Transferability of Evasion and Poisoning Attacks. arXiv:1809.02861 [cs.LG] https://arxiv.org/abs/1809.02861

  7. [7]

    Mohamed elShehaby and Ashraf Matrawy. 2025. SoK: Adversarial Evasion Attacks Practicality in NIDS Domain and the Impact of Dynamic Learning. arXiv:2306.05494 [cs.CR] https://arxiv.org/abs/2306.05494

  8. [8]

    Sabrine Ennaji, Fabio De Gaspari, Dorjan Hitaj, Alicia Kbidi, and Luigi V. Mancini

  9. [9]

    Adversarial Challenges in Network Intrusion Detection Systems: Research Insights and Future Prospects. arXiv:2409.18736 [cs.CR] https://arxiv.org/abs/2409.18736

  10. [10]

    Ian J. Goodfellow, Jonathon Shlens, and Christian Szegedy. 2015. Explaining and Harnessing Adversarial Examples. arXiv:1412.6572 [stat.ML] https://arxiv.org/abs/1412.6572

  11. [11]

    Dongqi Han, Zhiliang Wang, Ying Zhong, Wenqi Chen, Jiahai Yang, Shuqiang Lu, Xingang Shi, and Xia Yin. 2020. Practical Traffic-space Adversarial Attacks on Learning-based NIDSs. https://doi.org/10.48550/arXiv.2005.07519

  12. [12]

    Mohammad J. Hashemi, Greg Cusack, and Eric Keller. 2019. Towards Evaluation of NIDSs in Adversarial Setting. In Proceedings of the 3rd ACM CoNEXT Workshop on Big DAta, Machine Learning and Artificial Intelligence for Data Communication Networks. ACM, 14–21. https://doi.org/10.1145/3359992.3366642

  13. [13]

    Ke He, Dan Kim, Zhien Zhang, Mengmeng Ge, Ulysses Lam, and Jiaqi Yu. 2022. UQ IoT IDS Dataset 2021. https://doi.org/10.48610/17b44bb

  14. [14]

    Ke He, Dan Dongseong Kim, and Muhammad Rizwan Asghar. 2023. Adversarial Machine Learning for Network Intrusion Detection Systems: A Comprehensive Survey. IEEE Communications Surveys & Tutorials 25, 1 (2023), 538–566. https://doi.org/10.1109/COMST.2022.3233793

  15. [15]

    Vivek Kumar, Kamal Kumar, Maheep Singh, and Neeraj Kumar. 2025. NIDS-DA: Detecting functionally preserved adversarial examples for network intrusion detection system using deep autoencoders. Expert Systems with Applications 270 (2025), 126513. https://doi.org/10.1016/j.eswa.2025.126513

  16. [16]

    Aditya Kuppa, Slawomir Grzonkowski, Muhammad Rizwan Asghar, and Nhien-An Le-Khac. 2019. Black Box Attacks on Deep Anomaly Detectors. In Proceedings of the 14th International Conference on Availability, Reliability and Security. ACM, Article 21. https://doi.org/10.1145/3339252.3339266

  17. [17]

    Alexey Kurakin, Ian Goodfellow, and Samy Bengio. 2017. Adversarial examples in the physical world. arXiv:1607.02533 [cs.CV] https://arxiv.org/abs/1607.02533

  18. [18]

    Yao Li, Minhao Cheng, Cho-Jui Hsieh, and Thomas C. M. Lee. 2022. A Review of Adversarial Attack and Defense for Classification Methods. The American Statistician 76, 4 (Jan. 2022), 329–345. https://doi.org/10.1080/00031305.2021.2006781

  19. [19]

    Hung-Jen Liao, Chun-Hung Richard Lin, Ying-Chih Lin, and Kuang-Yuan Tung

  20. [20]

    Intrusion detection system: A comprehensive review. Journal of Network and Computer Applications 36, 1 (2013), 16–24. https://doi.org/10.1016/j.jnca.2012.09.004

  21. [21]

    Zilong Lin, Yong Shi, and Zhi Xue. 2022. IDSGAN: Generative Adversarial Networks for Attack Generation Against Intrusion Detection. In Advances in Knowledge Discovery and Data Mining. Springer International Publishing, 79–91. https://doi.org/10.1007/978-3-031-05981-0_7

  22. [22]

    Yisroel Mirsky, Tomer Doitshman, Yuval Elovici, and Asaf Shabtai. 2018. Kitsune: An Ensemble of Autoencoders for Online Network Intrusion Detection. arXiv:1802.09089 [cs.CR] https://arxiv.org/abs/1802.09089

  23. [23]

    Robin Sommer and Vern Paxson. 2010. Outside the Closed World: On Using Machine Learning for Network Intrusion Detection. In 2010 IEEE Symposium on Security and Privacy. 305–316. https://doi.org/10.1109/SP.2010.25

  24. [24]

    Youngrok Song, Sangwon Hyun, and Yun-Gyung Cheong. 2021. Analysis of Autoencoders for Network Intrusion Detection. Sensors 21, 13 (2021). https://doi.org/10.3390/s21134294

  25. [25]

    Subrat Kumar Swain, Vireshwar Kumar, Guangdong Bai, and Dan Dongseong Kim. 2024. PANDA: Practical Adversarial Attack Against Network Intrusion Detection. In 2024 54th Annual IEEE/IFIP International Conference on Dependable Systems and Networks-Supplemental Volume (DSN-S). IEEE, 28–32

  26. [26]

    Christian Szegedy, Wojciech Zaremba, Ilya Sutskever, Joan Bruna, Dumitru Erhan, Ian Goodfellow, and Rob Fergus. 2014. Intriguing properties of neural networks. arXiv:1312.6199 [cs.CV] https://arxiv.org/abs/1312.6199

  27. [27]

    Yihang Zhang, Yingwen Wu, and Xiaolin Huang. 2024. Toward transferable adversarial attacks against autoencoder-based network intrusion detectors. IEEE Transactions on Industrial Informatics (2024).