
arxiv: 2605.04250 · v2 · submitted 2026-05-05 · 💻 cs.CR · cs.NI

Binary Image-Based Intrusion Detection for Operational Technology Networks: Extending the SPHBI Methodology from IoT to Modbus TCP

Pith reviewed 2026-05-08 17:29 UTC · model grok-4.3

classification 💻 cs.CR cs.NI
keywords: intrusion detection, Modbus TCP, binary images, operational technology, network security, machine learning, SCADA, lightweight detection

The pith

Adding eight application-layer bytes to binary packet images enables 98.1% accurate intrusion detection in Modbus TCP networks with only 63 parameters.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper extends the SPHBI method from IoT to Modbus TCP by testing five levels of protocol depth on a dataset of 11.4 million packets. It shows that header-only approaches fail in uniform OT environments, whereas including a small amount of application-layer data yields high binary and multiclass accuracy with very small models. This matters for OT networks because it allows real-time per-packet classification on constrained devices without heavy computation. If correct, it shows that single-packet methods can cover most attack types, with replay attacks as the exception.

Core claim

Extending SPHBI to Modbus TCP reveals TCP/IP headers alone provide only 51.8% binary accuracy due to lack of heterogeneity in SCADA traffic. Incorporating eight bytes of application-layer information raises binary accuracy to 98.1% using a model with just 63 parameters. The best model achieves 94.4% multiclass accuracy on nine classes with 56,873 parameters, about 430 times smaller than ResNet50 approaches, and detects seven of eight attack types with over 94% recall while replay remains undetectable from single packets.

What carries the argument

The Single Packet Header Binary Image (SPHBI) approach, which transforms packet headers and limited payload bytes into binary images for classification by small convolutional networks, applied at varying protocol depths.
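
A minimal sketch of the packet-to-image step, assuming most-significant-bit-first unpacking and a row-major layout. This summary does not give the paper's exact byte selection, bit order, or padding rule, so the function name `bytes_to_binary_image`, its parameters, and the sample bytes are hypothetical. The sample uses the standard 7-byte Modbus MBAP header plus the function code as one plausible choice of eight application-layer bytes.

```python
from typing import List

def bytes_to_binary_image(data: bytes, rows: int, cols: int) -> List[List[int]]:
    """Unpack bytes into a rows x cols grid of 0/1 pixels (MSB first, row-major).

    Assumption: input shorter than rows*cols bits is zero-padded and longer
    input is truncated; the paper may handle this differently.
    """
    n_bits = rows * cols
    bits = []
    for byte in data:
        for shift in range(7, -1, -1):  # most significant bit first
            bits.append((byte >> shift) & 1)
    bits = bits[:n_bits] + [0] * max(0, n_bits - len(bits))
    return [bits[r * cols:(r + 1) * cols] for r in range(rows)]

# Hypothetical 8-byte slice: transaction ID 0x0001, protocol ID 0x0000,
# length 0x0006, unit ID 0x01, function code 0x03 (Read Holding Registers).
sample = bytes([0x00, 0x01, 0x00, 0x00, 0x00, 0x06, 0x01, 0x03])
image = bytes_to_binary_image(sample, rows=8, cols=8)
for row in image:
    print("".join(str(bit) for bit in row))
```

Calling the same function with `rows=16, cols=15` gives the 240-pixel input shape quoted for Approach 2b in Figure 2, which corresponds to 30 bytes of packet data.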

If this is right

  • TCP/IP headers prove insufficient for OT intrusion detection, unlike in IoT.
  • Adding eight bytes of application-layer data enables 98.1% binary accuracy with only 63 parameters, a model size suited to edge devices.
  • Multiclass accuracy reaches 94.4%, with a 95% confidence interval of 92.9% to 95.9% across 10 random seeds.
  • Seven attack types are reliably detected while replay attacks require multi-packet analysis.
  • Models use far fewer parameters than image-based deep learning baselines like ResNet50.
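
The multiclass figure above is a mean over 10 seeds with a 95% confidence interval. This summary does not say how the interval was computed; a Student-t interval on the per-seed accuracies is one common choice and is assumed in the sketch below. The per-seed values are invented for illustration and are not the paper's results.

```python
import math
import statistics

# Two-sided 95% Student-t critical value for 9 degrees of freedom (10 seeds).
T_CRIT_95_DF9 = 2.262

def mean_and_ci95(accuracies):
    """Return (mean, lower, upper) of a t-based 95% CI for exactly 10 runs."""
    if len(accuracies) != 10:
        raise ValueError("T_CRIT_95_DF9 is only valid for 10 samples")
    mean = statistics.mean(accuracies)
    sd = statistics.stdev(accuracies)  # sample standard deviation
    half_width = T_CRIT_95_DF9 * sd / math.sqrt(len(accuracies))
    return mean, mean - half_width, mean + half_width

# Hypothetical per-seed accuracies in percent; NOT the paper's data.
seed_acc = [91.0, 92.5, 93.0, 94.0, 94.5, 95.0, 95.5, 96.0, 96.5, 97.0]
mean, lower, upper = mean_and_ci95(seed_acc)
print(f"mean={mean:.1f}%  95% CI=[{lower:.1f}%, {upper:.1f}%]")
```

The reported interval of 92.9% to 95.9% has a half-width of 1.5 percentage points around the 94.4% mean.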

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Similar binary image methods may apply to other uniform-header protocols in industrial control systems.
  • Hybrid systems combining single-packet SPHBI with sequence analysis could handle replay attacks.
  • Deployment on actual OT hardware would test if the reported accuracies hold under live traffic variations.
  • The parameter efficiency suggests potential for on-device training or frequent updates in resource-limited settings.
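
The replay limitation and the hybrid idea in the second bullet can be illustrated with a toy sketch. A replayed packet is byte-for-byte a copy of a legitimate one, so a deterministic per-packet classifier must assign both the same label; only state kept across packets can separate them. The `ReplayWindow` class, the window size, and the SHA-256 digest are illustrative assumptions, not something the paper proposes. Legitimate Modbus polling may also repeat requests, so a practical detector would need more than exact-duplicate matching.

```python
import hashlib
from collections import deque

class ReplayWindow:
    """Flag packets whose bytes were already seen within the last `size` packets."""

    def __init__(self, size: int = 1000):
        self._order = deque()  # digests in arrival order
        self._seen = {}        # digest -> number of occurrences inside the window
        self._size = size

    def observe(self, packet: bytes) -> bool:
        """Record the packet and return True if it duplicates a recent one."""
        digest = hashlib.sha256(packet).digest()
        duplicate = self._seen.get(digest, 0) > 0
        self._order.append(digest)
        self._seen[digest] = self._seen.get(digest, 0) + 1
        if len(self._order) > self._size:
            # Evict the oldest digest so the window stays bounded.
            old = self._order.popleft()
            self._seen[old] -= 1
            if self._seen[old] == 0:
                del self._seen[old]
        return duplicate

window = ReplayWindow(size=3)
original = bytes([0x00, 0x01, 0x00, 0x00, 0x00, 0x06, 0x01, 0x03])
first = window.observe(original)   # first sighting of these bytes
second = window.observe(original)  # identical bytes seen again
print(first, second)
```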

Load-bearing premise

The CIC Modbus 2023 dataset represents typical real-world Modbus TCP traffic and the specific attack behaviors, with single-packet binary images providing enough information to distinguish attacks.

What would settle it

Evaluating the same models on a new Modbus TCP dataset collected from a different industrial environment or including more replay attack instances, and observing whether binary accuracy falls significantly below 98% or multiclass accuracy below 90%.

Figures

Figures reproduced from arXiv: 2605.04250 by Aamir Omar.

Figure 2: Multiclass CNN architecture for Approach 2b (16×15 input, 56,873 parameters). The original SPHBI multiclass classifier uses four convolutional layers with Sigmoid activation. When applied to input images larger than 8×8, this architecture suffered from vanishing gradients: training loss remained effectively constant across 100 epochs and the model collapsed to predicting the majority class. All approaches…
Figure 4: Normalised confusion matrix for Approach 2b multiclass (cap = 5,000). Rows represent true labels; columns represent predicted labels. Replay and length manipulation are the only classes with recall below 50%.
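
The row normalisation described in the Figure 4 caption can be sketched as follows. With true labels on the rows, dividing each row by its total puts per-class recall on the diagonal. The function names and the 3-class counts are hypothetical and are not taken from the paper.

```python
from typing import List

def row_normalise(cm: List[List[int]]) -> List[List[float]]:
    """Divide each row (true class) by its total so non-empty rows sum to 1."""
    out = []
    for row in cm:
        total = sum(row)
        out.append([count / total if total else 0.0 for count in row])
    return out

def per_class_recall(cm: List[List[int]]) -> List[float]:
    """Recall of class i is the diagonal entry of the row-normalised matrix."""
    norm = row_normalise(cm)
    return [norm[i][i] for i in range(len(cm))]

# Hypothetical counts: rows are true labels, columns are predicted labels.
cm = [[90, 5, 5],
      [10, 80, 10],
      [30, 30, 40]]
recall = per_class_recall(cm)
print([round(r, 2) for r in recall])
```
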
Figure 5: Multiclass accuracy by training cap for Approaches 2b, 3 and 3b (10 seeds per point, shaded regions show 95% confidence intervals).
5.4 Hyperparameter Sensitivity. A 2⁴ factorial experiment (32 single-seed runs) confirmed that binary classification is insensitive to training configuration (0.02pp spread). For multiclass, the factorial identified Sigmoid activation with batch normalisation as the best singl…
Original abstract

This paper extends the Single Packet Header Binary Image (SPHBI) intrusion detection methodology from IoT to Modbus TCP, evaluating five approaches spanning a gradient of protocol depth on the CIC Modbus 2023 dataset (11.4 million packets, eight detectable attack types). TCP/IP headers alone achieve only 51.8% binary accuracy, confirming that header-level heterogeneity exploited in IoT traffic is absent in uniform SCADA environments. Adding eight bytes of application-layer information improves binary accuracy to 98.1% with just 63 parameters, directly relevant to per-packet classification on resource-constrained OT edge devices. The best-performing approach achieves 94.4% +/- 2.2pp multiclass accuracy across nine classes (95% CI [92.9%, 95.9%], 10 seeds) with 56,873 parameters, roughly 430 times fewer than comparable ResNet50-based approaches. Per-class recall analysis shows seven of eight detectable attack types identified with recall above 94%, while replay attacks remain structurally undetectable by any single-packet method.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper extends the Single Packet Header Binary Image (SPHBI) methodology from IoT to Modbus TCP operational technology networks. It evaluates five approaches with increasing protocol depth (from TCP/IP headers alone to full application-layer inclusion) on the CIC Modbus 2023 dataset of 11.4 million packets covering eight attack types. TCP/IP headers yield only 51.8% binary accuracy, while adding eight application-layer bytes raises binary accuracy to 98.1% using 63 parameters. The best approach achieves 94.4% ± 2.2pp multiclass accuracy (95% CI [92.9%, 95.9%], 10 seeds) across nine classes with 56,873 parameters—roughly 430 times fewer than ResNet50 baselines. Per-class analysis shows high recall (>94%) for seven of eight detectable attacks, with replay attacks identified as structurally undetectable by single-packet classifiers.

Significance. If the empirical results hold under scrutiny, the work offers a lightweight, parameter-efficient intrusion detection method tailored to resource-constrained OT edge devices. It demonstrates that uniform SCADA traffic lacks the header heterogeneity exploited in IoT settings, but that minimal application-layer bytes suffice for strong binary detection. The dramatic reduction in model size relative to deep learning baselines, combined with explicit acknowledgment of single-packet limitations, provides a practical foundation for real-time per-packet classification in industrial control systems.

major comments (2)
  1. [Methods/Experimental Setup] The abstract reports concrete performance numbers (98.1% binary accuracy, 94.4% ± 2.2pp multiclass with 95% CI from 10 seeds) and parameter counts (63 and 56,873), yet the manuscript provides no details on model architectures, training procedures, data splits, preprocessing, or hyperparameter selection. This absence is load-bearing for the central performance claims, as it prevents verification that the reported accuracies and efficiency gains are not artifacts of undisclosed implementation choices or selection effects.
  2. [Dataset and Evaluation] The headline claims rest on the assumption that the CIC Modbus 2023 corpus faithfully represents real-world Modbus TCP traffic distributions, attack timing, and payload variability. The paper itself notes that replay attacks are structurally invisible to any single-packet classifier, but offers no cross-dataset validation, live-network testing, or analysis of whether other attack classes similarly require flow-level or temporal context absent from the binary-image representation. This directly affects the practical applicability asserted for OT networks.
minor comments (2)
  1. [Abstract/Introduction] The abstract and results would benefit from an explicit statement of how the binary images are constructed (e.g., exact packet-to-image mapping, dimensions, and normalization) to aid reproducibility.
  2. [Results] Table or figure captions for per-class recall results should include the exact number of samples per class to contextualize the reported recalls above 94%.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed review. The comments identify key gaps in reproducibility and evaluation scope that we address below. We commit to revisions that strengthen the manuscript without altering its core claims.

Point-by-point responses
  1. Referee: Methods/Experimental Setup: The abstract reports concrete performance numbers (98.1% binary accuracy, 94.4% ± 2.2pp multiclass with 95% CI from 10 seeds) and parameter counts (63 and 56,873), yet the manuscript provides no details on model architectures, training procedures, data splits, preprocessing, or hyperparameter selection. This absence is load-bearing for the central performance claims, as it prevents verification that the reported accuracies and efficiency gains are not artifacts of undisclosed implementation choices or selection effects.

    Authors: We agree that the absence of these details hinders reproducibility and verification. The manuscript as submitted emphasizes the methodology and empirical outcomes but does not include the implementation specifics. In the revised version we will add a dedicated subsection in Methods that specifies: the neural network architectures for each of the five protocol-depth approaches (layer types, dimensions, activations); training procedures (optimizer, learning rate schedule, batch size, epochs, early stopping); data splits (train/validation/test proportions and how the 11.4 million packets were partitioned); preprocessing steps (byte extraction from Modbus TCP packets, binary-image construction, and normalization); and hyperparameter selection (any tuning process or fixed values). We will also document the ten random seeds used for the reported confidence intervals. These additions will allow independent reproduction of the accuracy and parameter-count results. revision: yes

  2. Referee: Dataset and Evaluation sections: The headline claims rest on the assumption that the CIC Modbus 2023 corpus faithfully represents real-world Modbus TCP traffic distributions, attack timing, and payload variability. The paper itself notes that replay attacks are structurally invisible to any single-packet classifier, but offers no cross-dataset validation, live-network testing, or analysis of whether other attack classes similarly require flow-level or temporal context absent from the binary-image representation. This directly affects the practical applicability asserted for OT networks.

    Authors: The referee correctly identifies a genuine limitation. CIC Modbus 2023 is a large, publicly available labeled corpus, yet it remains a single source whose fidelity to operational OT traffic cannot be assumed without further evidence. The manuscript already states that replay attacks are undetectable by any single-packet method; we will expand this point. In revision we will insert an extended Limitations and Future Work subsection that (i) analyzes which of the remaining attack classes may also benefit from temporal or flow-level features beyond the current binary-image representation, (ii) explicitly notes the reliance on one dataset and the absence of cross-dataset or live-network validation, and (iii) outlines planned extensions such as multi-packet aggregation. We cannot retroactively perform live-network testing or new cross-dataset experiments within the current study scope, but the added discussion will better bound the claimed applicability to resource-constrained OT edge devices. revision: partial

Circularity Check

0 steps flagged

No circularity: empirical results on external dataset are self-contained

Full rationale

The paper reports direct empirical measurements of binary and multiclass accuracy for five protocol-depth variants of a binary-image classifier, evaluated on the named external CIC Modbus 2023 corpus (11.4 million packets). No equations, fitted parameters, or predictions are defined in terms of the target metrics; the reported figures (98.1% binary accuracy with 63 parameters, 94.4% multiclass with 56,873 parameters) are obtained via standard training/testing splits and cross-seed averaging. References to the prior SPHBI methodology serve only to describe the extension being tested and do not supply load-bearing premises that reduce the new results to self-citation or self-definition. The central claims therefore rest on independent experimental content rather than any of the enumerated circular patterns.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The central performance claims rest on the representativeness of the CIC Modbus 2023 dataset and the assumption that binary images of limited-depth packet data capture discriminative features for the attack types present.

free parameters (1)
  • Trainable model parameters
    The 56,873 and 63 parameter counts are the sizes of the neural networks whose weights are fitted to the training portion of the dataset.
axioms (1)
  • domain assumption: Binary image representations of packet headers plus limited application data contain sufficient information to distinguish normal Modbus TCP traffic from the eight attack types in the dataset.
    This is the foundational premise of the SPHBI methodology being extended to the new protocol.



Reference graph

Works this paper leans on

10 extracted references · 5 canonical work pages

  1. [1] Anthi, E., Williams, L., Burnap, P., Jones, K.: A three-tiered intrusion detection system for industrial control systems. J. Cybersecur. 7(1), tyab006 (2021). https://doi.org/10.1093/cybsec/tyab006

  2. [2] Baptista, I., Shiaeles, S., Kolokotronis, N.: A novel malware detection system based on machine learning and binary visualization. In: ICC Workshops 2019, pp. 1–6. IEEE (2019)

  3. [3] Boakye-Boateng, K., Ghorbani, A.A., Lashkari, A.H.: Securing substations with trust, risk posture and multi-agent systems: a comprehensive approach. In: 20th International Conference on Privacy, Security and Trust (PST), pp. 1–12. IEEE, Copenhagen (2023). https://doi.org/10.1109/PST58708.2023.10320154

  4. [4] El-Sherif, M., Khattab, A., El-Soudani, M.: Intrusion detection using TCP/IP single packet header binary image for IoT networks. Cybersecurity 8(104), 1–23 (2025). https://doi.org/10.1186/s42400-025-00441-x

  5. [5] Ferrag, M.A., Friha, O., Hamouda, D., Maglaras, L., Janicke, H.: Edge-IIoTset: a new comprehensive realistic cyber security dataset of IoT and IIoT applications for centralized and federated learning. IEEE Access 10, 40281–40306 (2022)

  6. [6] Kotsiopoulos, T., Radoglou-Grammatikis, P., Lekka, Z., Mladenov, V., Sarigiannidis, P.: Defending industrial internet of things against Modbus/TCP threats: a combined AI-based detection and SDN-based mitigation solution. Int. J. Inf. Secur. 24(157) (2025). https://doi.org/10.1007/s10207-025-01076-2

  7. [7] Nataraj, L., Karthikeyan, S., Jacob, G., Manjunath, B.S.: Malware images: visualization and automatic classification. In: Proceedings of the 8th International Symposium on Visualization for Cyber Security (VizSec), pp. 1–7. ACM (2011)

  8. [8] Russo, S., Zanasi, C., Marasco, I.: Feature extraction for anomaly detection in industrial control systems. In: ITASEC 2024. CEUR Workshop Proceedings, vol. 3731, paper 22 (2024)

  9. [9] Termanini, A., Al-Abri, D., Bourdoucen, H., Al Maashri, A.: Using machine learning to detect network intrusions in industrial control systems: a survey. Int. J. Inf. Secur. (2024). https://doi.org/10.1007/s10207-024-00916-x

  10. [10] Wang, W., Zhu, M., Zeng, X., Ye, X., Sheng, Y.: Malware traffic classification using convolutional neural network for representation learning. In: International Conference on Information Networking (ICOIN), pp. 712–717. IEEE (2017)