Large Language Models as Explainable Cyberattack Detectors for Energy Industrial Control Systems
Pith reviewed 2026-05-07 15:34 UTC · model grok-4.3
The pith
An off-the-shelf large language model can triage Modbus traffic in energy industrial control systems into normal or critical events with performance comparable to trained supervised detectors and without any task-specific updates.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Under matched event information and shared evaluation splits, the LLM-based triage pipeline achieves high predictive performance on both benchmarks and is broadly comparable to strong supervised baselines, while requiring no task-specific weight updates. Intervention-based diagnostics provide evidence that the cited tokens are often decision-relevant to the model's own prediction.
What carries the argument
A prompt-configured large language model that receives a compact token string derived from discretized Modbus protocol fields and returns a normal-or-critical decision plus a token-grounded incident record.
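The conversion from Modbus fields to a token string can be sketched as follows. This is a minimal illustration, not the authors' encoder: the `KEY=value` token layout, the `DT` field name, and the bin edges in `DT_EDGES` are all assumptions introduced here. Only the `DIR`, `FC`, and `EX` field names, the `C2S` direction value, and the `B0`/`B1` style of interval bins appear in the paper's prompt excerpt.

```python
from bisect import bisect_right

# Hypothetical bin edges (seconds) for inter-arrival time.
# The paper's actual edges are not given in this digest.
DT_EDGES = [0.001, 0.01, 0.1, 1.0, 10.0]


def bin_label(value, edges, prefix="B"):
    """Return a bin token such as 'B0' for the interval containing value."""
    return f"{prefix}{bisect_right(edges, value)}"


def tokenize_event(direction, function_code, inter_arrival_s, exception=False):
    """Render one Modbus event as a compact, space-separated token string."""
    tokens = [
        f"DIR={direction}",                           # e.g. C2S (client to server)
        f"FC={function_code}",                        # Modbus function code
        f"DT={bin_label(inter_arrival_s, DT_EDGES)}", # discretized timing
    ]
    if exception:
        tokens.append("EX=1")                         # exception response seen
    return " ".join(tokens)


print(tokenize_event("C2S", 3, 0.5))
```

Binning makes the input short and stable for prompting, but any two raw values that fall in the same bin become indistinguishable to the model, which is the information-loss concern the referee report raises.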
Load-bearing premise
Converting Modbus fields into a compact discretized token string preserves enough information for the LLM to make accurate normal-versus-critical decisions.
What would settle it
On a held-out Modbus dataset, if the LLM's accuracy falls substantially below matched supervised baselines or if intervention diagnostics show that the cited tokens are not causally tied to the prediction, the central claim would be falsified.
Original abstract
In modern energy systems, industrial control systems (ICS) and power-system SCADA require intrusion detection that is not only accurate but also auditable by operators. The ICS intrusion-detection landscape is currently dominated by established supervised detectors. In this paper, we study whether an off-the-shelf large language model (LLM) can serve as a complementary, human-in-the-loop layer for Modbus traffic. We cast this as a binary network-side normal/critical decision task on two public ICS Modbus datasets, collapsing attack periods and other safety-critical behaviors into a single critical class. Each Modbus communication instance is converted into a compact token string derived from discretized protocol fields, and a prompt-configured LLM produces a normal/critical alert together with a concise, token-grounded incident record for analyst review. Under matched event information and shared evaluation splits, the resulting LLM-based triage pipeline achieves high predictive performance on both benchmarks and is broadly comparable to strong supervised baselines, while requiring no task-specific weight updates. To assess the audit record, we apply intervention-based diagnostics, including sufficiency- and necessity-style tests, which provide evidence that the cited tokens are often decision-relevant to the model's own prediction. These records are intended as audit signals rather than full human-grounded explanations.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes using an off-the-shelf LLM as a zero-shot, explainable triage layer for binary normal/critical classification of Modbus traffic in energy ICS/SCADA systems. Modbus instances are converted to compact discretized token strings derived from protocol fields; a prompt-configured LLM outputs the label plus a concise, token-grounded incident record. The approach is evaluated on two public ICS Modbus datasets under matched event information and shared splits, claiming high predictive performance broadly comparable to strong supervised baselines without any task-specific weight updates. Intervention-based diagnostics (sufficiency/necessity tests) are applied to provide evidence that cited tokens are decision-relevant to the model's predictions, positioning the records as audit signals for human analysts.
Significance. If the performance claims hold under the stated matching conditions, the work demonstrates a practical route to training-free, human-auditable detection for critical infrastructure that complements existing supervised detectors. Credit is due for the use of public datasets, the absence of task-specific fitting or invented parameters, and the explicit intervention diagnostics that move beyond post-hoc attribution. The approach directly addresses the auditability requirement highlighted in the ICS security literature.
major comments (3)
- [Abstract and §4] Abstract and §4 (Evaluation): the central claim that the LLM pipeline 'achieves high predictive performance ... and is broadly comparable to strong supervised baselines' is load-bearing yet unsupported by any numeric metrics, confidence intervals, or per-class results in the provided abstract; the full results section must supply these values together with the exact baseline implementations and feature sets to permit verification.
- [§3] §3 (Method, discretization step): the claim of matched event information rests on converting Modbus fields (register values, function codes, addresses) into a compact discretized token string. This quantization necessarily bins or drops fine-grained numeric thresholds and multi-packet timing that supervised baselines receive directly; without an explicit ablation or side-by-side feature-equivalence test, it is unclear whether the LLM input is informationally complete relative to the baselines used for comparison.
- [§5] §5 (Intervention diagnostics): the sufficiency/necessity tests demonstrate that certain tokens influence the LLM's own prediction, but they operate entirely within the discretized token representation. They therefore cannot test whether that representation itself preserves the discriminative information present in the raw Modbus records employed by the supervised baselines.
minor comments (2)
- [§3.2] Clarify in the prompt template (likely §3.2) whether the LLM is instructed to output only the binary label plus cited tokens or whether additional free-form text is permitted; this affects reproducibility of the audit record.
- [Results figures] Table captions and axis labels in the results figures should explicitly state the evaluation split and the precise definition of the 'critical' class (collapsed attacks plus safety-critical behaviors).
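The first major comment asks for metrics with confidence intervals. One common way to produce them, sketched here under assumptions, is a percentile bootstrap over the evaluation set. The function names and the toy labels below are hypothetical and are not results from the paper; the manuscript may use a different interval method.

```python
import random


def f1_critical(y_true, y_pred, positive="critical"):
    """F1 score for the positive ('critical') class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def bootstrap_ci(y_true, y_pred, metric, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for metric(y_true, y_pred)."""
    rng = random.Random(seed)
    n = len(y_true)
    scores = []
    for _ in range(n_resamples):
        # Resample event indices with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        scores.append(metric([y_true[i] for i in idx], [y_pred[i] for i in idx]))
    scores.sort()
    lo = scores[int((alpha / 2) * n_resamples)]
    hi = scores[min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))]
    return lo, hi


# Toy labels for illustration only.
toy_true = ["critical", "normal", "critical", "normal"]
toy_pred = ["critical", "normal", "normal", "normal"]
print(f1_critical(toy_true, toy_pred), bootstrap_ci(toy_true, toy_pred, f1_critical))
```

Because the same resampled indices can be applied to the LLM pipeline and to each baseline, this style of interval fits the shared-split comparison the report asks to verify.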
Simulated Author's Rebuttal
Thank you for the constructive and detailed review. We appreciate the recognition of the work's focus on training-free, auditable detection for ICS environments and the value placed on public datasets and intervention diagnostics. We address each major comment below, indicating the revisions we will incorporate to improve clarity, completeness, and verifiability.
-
Referee: [Abstract and §4] Abstract and §4 (Evaluation): the central claim that the LLM pipeline 'achieves high predictive performance ... and is broadly comparable to strong supervised baselines' is load-bearing yet unsupported by any numeric metrics, confidence intervals, or per-class results in the provided abstract; the full results section must supply these values together with the exact baseline implementations and feature sets to permit verification.
Authors: We agree that the abstract would benefit from explicit numeric support for the performance claim. In the revised manuscript we will add key metrics (accuracy, F1, per-class precision/recall) with confidence intervals to the abstract. Section 4 will be expanded to report the precise supervised baseline implementations (including libraries, hyperparameters, and training procedures) and the exact feature sets extracted from the raw Modbus records, enabling direct verification and reproduction.
Revision: yes
-
Referee: [§3] §3 (Method, discretization step): the claim of matched event information rests on converting Modbus fields (register values, function codes, addresses) into a compact discretized token string. This quantization necessarily bins or drops fine-grained numeric thresholds and multi-packet timing that supervised baselines receive directly; without an explicit ablation or side-by-side feature-equivalence test, it is unclear whether the LLM input is informationally complete relative to the baselines used for comparison.
Authors: The discretization was constructed to retain the protocol fields most relevant to the binary normal/critical task while producing compact token strings suitable for prompting. We acknowledge that an explicit ablation would strengthen the matched-information claim. In the revision we will add an ablation study varying discretization granularity (bin widths for register values and address ranges) and report its effect on LLM performance. We will also include a side-by-side mapping table showing how each token corresponds to the raw fields and any timing aggregates supplied to the baselines, together with a discussion of any information loss for multi-packet sequences.
Revision: yes
-
Referee: [§5] §5 (Intervention diagnostics): the sufficiency/necessity tests demonstrate that certain tokens influence the LLM's own prediction, but they operate entirely within the discretized token representation. They therefore cannot test whether that representation itself preserves the discriminative information present in the raw Modbus records employed by the supervised baselines.
Authors: The intervention tests are scoped to the LLM's internal decision process on the token representation; they are not intended to validate cross-representation equivalence. We will revise §5 to state this limitation explicitly and to clarify that the primary empirical comparison to baselines occurs at the level of task performance under identical event splits. The diagnostics serve to substantiate the audit-record utility rather than to prove representational completeness. We will also note this as a boundary condition for future work that might explore raw-data prompting or hybrid representations.
Revision: yes
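The sufficiency- and necessity-style tests discussed in the third exchange can be sketched as token interventions on the model's own input. This is an illustration under assumptions: the digest does not say whether the paper removes, masks, or replaces tokens, so removal is used here, and `toy_predict` is a hypothetical stand-in for the LLM call.

```python
def sufficiency_holds(predict, tokens, cited):
    """True if the prediction is preserved when only cited tokens are kept."""
    kept = [t for t in tokens if t in cited]
    return predict(kept) == predict(tokens)


def necessity_holds(predict, tokens, cited):
    """True if the prediction changes when the cited tokens are removed."""
    ablated = [t for t in tokens if t not in cited]
    return predict(ablated) != predict(tokens)


def toy_predict(tokens):
    # Hypothetical stand-in for the LLM: flags one write-style
    # function code token as critical, everything else as normal.
    return "critical" if "FC=5" in tokens else "normal"


event = ["DIR=C2S", "FC=5", "DT=B3"]  # illustrative token layout
print(sufficiency_holds(toy_predict, event, {"FC=5"}))
print(necessity_holds(toy_predict, event, {"FC=5"}))
```

Both checks compare the model against itself on edited token strings, which is why they say nothing about information lost before tokenization.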
Circularity Check
No significant circularity; empirical evaluation is self-contained
The paper describes an empirical pipeline that converts public Modbus datasets into discretized token strings, feeds them to an off-the-shelf LLM via prompting, and reports predictive performance against externally trained supervised baselines on matched splits. No equations, fitted parameters, or derivations are presented that reduce to their own inputs by construction. The intervention diagnostics are applied after the fact to inspect token relevance within the LLM's own representation and do not define or force the reported accuracy numbers. The central claim of comparability without weight updates therefore rests on independent data and external benchmarks rather than self-referential construction.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption: Discretized protocol fields converted to token strings preserve sufficient information for accurate normal/critical classification by an LLM.
- domain assumption: Intervention-based diagnostics can establish that cited tokens are causally relevant to the model's prediction.
Reference graph
Works this paper leans on
-
[1]
Tom B. Brown, Benjamin Mann, Nick Ryder, et al. 2020. Language Models are Few-Shot Learners. In Advances in Neural Information Processing Systems, Vol. 33. Curran Associates, Inc., Red Hook, NY, USA, 1877–1901. doi:10.5555/3495724.3495881
-
[2]
Canadian Institute for Cybersecurity (UNB). 2023. CIC Modbus dataset 2023. https://www.unb.ca/cic/datasets/modbus-2023.html Accessed: 2025-11-06
-
[3]
Cybersecurity and Infrastructure Security Agency. 2021. Cyber-Attack Against Ukrainian Critical Infrastructure. https://www.cisa.gov/news-events/ics-alerts/ir-alert-h-16-056-01. ICS Alert IR-ALERT-H-16-056-01, Accessed: 2026-03-27
-
[4]
Jay DeYoung, Sarthak Jain, Nazneen Fatema Rajani, Eric Lehman, Caiming Xiong, Richard Socher, and Byron C. Wallace. 2020. ERASER: A Benchmark to Evaluate Rationalized NLP Models. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics, Online, 4443–4458. doi:10.18653/v1/2020.ac...
-
[5]
Yousif Hosain and Muhammet Çakmak. 2025. XAI-XGBoost: An Innovative Explainable Intrusion Detection Approach for Securing Internet of Medical Things Systems. Scientific Reports 15, 1 (2025), 22278. doi:10.1038/s41598-025-07790-0
-
[6]
Yan Hu, An Yang, Hong Li, Yuyan Sun, and Limin Sun. 2018. A Survey of Intrusion Detection on Industrial Control Systems. International Journal of Distributed Sensor Networks 14, 8 (2018), 1–13. doi:10.1177/1550147718794615
-
[7]
Naseem Khan, Kashif Ahmad, Aref Al Tamimi, Mohammed M. Alani, Amine Bermak, and Issa Khalil. 2024. Explainable AI-Based Intrusion Detection System for Industry 5.0: An Overview of the Literature, Associated Challenges, the Existing Solutions, and Potential Research Directions. arXiv:2408.03335 https://arxiv.org/abs/2408.03335
-
[8]
Antoine LeMay and Jose M. Fernandez. 2016. Providing SCADA Network Data Sets for Intrusion Detection Research. In 9th Workshop on Cyber Security Experimentation and Test (CSET 16). USENIX Association, Austin, TX, USA. https://www.usenix.org/conference/cset16/workshop-program/presentation/lemay
-
[9]
Pengfei Liu, Weizhe Yuan, Jinlan Fu, Zhengbao Jiang, Hiroaki Hayashi, and Graham Neubig. 2023. Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing. Comput. Surveys 55, 9 (2023), 1–35. doi:10.1145/3560815
-
[10]
Yao Liu, Peng Ning, and Michael K. Reiter. 2009. False Data Injection Attacks against State Estimation in Electric Power Grids. In Proceedings of the 16th ACM Conference on Computer and Communications Security (CCS ’09). Association for Computing Machinery, Chicago, IL, USA, 21–32. doi:10.1145/1653662.1653666
-
[11]
OpenAI. 2024. GPT-4o System Card. https://openai.com/index/gpt-4o-system-card/. Accessed: 2026-03-27
- [12]
-
[13]
Muhammad Azmi Umer, Khurum Nazir Junejo, Muhammad Taha Jilani, and Aditya P. Mathur. 2022. Machine Learning for Intrusion Detection in Industrial Control Systems: Applications, Challenges, and Recommendations. International Journal of Critical Infrastructure Protection 38 (2022), 100516. doi:10.1016/j.ijcip.2022.100516
-
[14]
Sahil Verma, Varich Boonsanong, Minh Hoang, Keegan Hines, John P. Dickerson, and Chirag Shah. 2024. Counterfactual Explanations and Algorithmic Recourses for Machine Learning: A Review. Comput. Surveys 56, 12 (2024), 312:1–312:42. doi:10.1145/3677119
-
[15]
Hanxiang Xu, Shenao Wang, Ningke Li, Kailong Wang, Yanjie Zhao, Kai Chen, Ting Yu, Yang Liu, and Haoyu Wang. 2024. Large Language Models for Cyber Security: A Systematic Literature Review. arXiv:2405.04760 https://arxiv.org/abs/2405.04760
Appendix: Example Prompts
This appendix lists one representative configuration of the prompts used in our experiments...
-
[16]
**The Principle of Impact:**
- Ask yourself: "Does this command merely *observe* the system (Read), or does it attempt to *change* the system (Write/Control)?"
- **READ operations** (Polling) are the baseline of industrial automation and are generally **Normal**, unless they occur at an impossible speed (Flooding).
- **WRITE/CONTROL operations** actively ...
-
[17]
**The Principle of Rhythm (Time Analysis):**
- Machines are rhythmic; Hackers are bursty.
- **Normal:** Periodic, steady inter-arrival times (e.g., regular polling intervals).
- **Critical:** Sudden deviations from the rhythm. Extremely short intervals (B0/B1) suggest automated flooding or fuzzing. Extremely long gaps followed by activity may suggest a "l...
-
[18]
**The Principle of Protocol Compliance:**
- Any proprietary, undefined, or malformed function codes are immediately **Critical**.
- Exception codes (EX) usually indicate a device failure or a scanner probing invalid addresses.
### INPUT FORMAT LEGEND
You will receive a single log line with discretized tokens:
- **DIR:** Direction (C2S = Client to Server /...
-
[19]
**Semantic Decode:** What is the specific purpose of this FC according to the Modbus protocol standard? (e.g., Is it reading inputs or forcing coils?)
-
[20]
**Intent Check:** Does this combination of Direction + FC + Frequency look like a SCADA master polling a sensor (Benign), or an external actor trying to manipulate the grid (Malicious)?
e-Energy ’26, June 2026, Banff, AB, Canada. Weiyi Kong, Ahmad Mohammad Saber, Amr Youssef, and Deepa Kundur
-
[21]
**Risk Assessment:** If this command succeeds, could it physically trip a breaker or alter a sensor reading?
### OUTPUT FORMAT
Output ONLY a JSON object.
- "label": Must be "normal" or "critical".
- "confidence": Float between 0.0 and 1.0.
- "rationale": A concise, 1-sentence explanation focusing on the *operational impact* (e.g., "Unauthorized attempt to...
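A pipeline consuming this output format would need to check each reply before logging it. The sketch below validates the three keys visible in the excerpt; the function name `parse_alert` and the sample reply are hypothetical, and the truncated prompt may define further keys that are not handled here.

```python
import json

ALLOWED_LABELS = {"normal", "critical"}


def parse_alert(raw):
    """Parse and validate one model reply; raise ValueError if malformed."""
    obj = json.loads(raw)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(obj, dict):
        raise ValueError("expected a JSON object")

    label = obj.get("label")
    if label not in ALLOWED_LABELS:
        raise ValueError(f"unexpected label: {label!r}")

    conf = obj.get("confidence")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(conf, bool) or not isinstance(conf, (int, float)):
        raise ValueError("confidence must be a number")
    if not 0.0 <= conf <= 1.0:
        raise ValueError("confidence must lie in [0.0, 1.0]")

    rationale = obj.get("rationale")
    if not isinstance(rationale, str) or not rationale.strip():
        raise ValueError("rationale must be a non-empty string")

    return {"label": label, "confidence": float(conf), "rationale": rationale.strip()}


# Hypothetical reply, for illustration only.
sample = '{"label": "critical", "confidence": 0.9, "rationale": "Write to a coil."}'
print(parse_alert(sample))
```

Rejecting malformed replies rather than coercing them keeps the audit record trustworthy: an analyst only ever sees records that match the requested schema.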