One Step Further: Understanding PLC Binaries Through Cross-Platform Reverse Engineering and Function-Level Semantic Analysis
Pith reviewed 2026-05-21 08:26 UTC · model grok-4.3
The pith
PLC-BinX recovers function-level semantics from PLC binaries on four platforms to enable perfect toolchain prediction and moderate functionality prediction.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
PLC-BinX analyzes PLC binaries from CODESYS v3, GEB, OpenPLC v2, and OpenPLC v3 through cross-platform reverse engineering, core-function extraction, and function-level semantic representation construction. Using the resulting representations under ten-fold program-level evaluation, the system reaches 100.00 percent precision, recall, and F1 in toolchain prediction and 51.43 percent precision, 49.38 percent recall, and 49.18 percent F1 in functionality prediction over 22 labels. The approach therefore exposes task-relevant function-level semantics from heterogeneous PLC binaries in an effective and interpretable form.
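The reported functionality F1 (49.18) is lower than the harmonic mean of the reported precision and recall (about 50.38). That pattern is what one would expect if F1 were averaged per label or per fold rather than computed from the aggregate precision and recall. The summary does not state which averaging the paper uses, so the following is only a minimal sketch of macro averaging under a single-label assumption; the function names and the toy labels are hypothetical.

```python
from collections import Counter


def macro_prf(y_true, y_pred):
    """Macro-averaged precision, recall, and F1 (single-label assumption)."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted label gains a false positive
            fn[truth] += 1  # true label gains a false negative

    precisions, recalls, f1s = [], [], []
    for label in labels:
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        p = tp[label] / p_den if p_den else 0.0
        r = tp[label] / r_den if r_den else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)

    n = len(labels)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


def harmonic_mean(p, r):
    """F1 computed directly from one precision value and one recall value."""
    return 2 * p * r / (p + r) if (p + r) else 0.0


# Toy labels (hypothetical, not taken from the paper's 22-label set).
y_true = ["pid", "pid", "timer", "counter"]
y_pred = ["pid", "timer", "timer", "timer"]
macro_p, macro_r, macro_f1 = macro_prf(y_true, y_pred)
print(macro_p, macro_r, macro_f1, harmonic_mean(macro_p, macro_r))
```

On the toy data the macro F1 falls below the harmonic mean of macro precision and macro recall, the same ordering as in the reported figures. This does not confirm the paper's averaging scheme; it only shows the ordering is not an inconsistency.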
What carries the argument
The three-stage workflow of cross-platform reverse engineering, core-function extraction to isolate control logic, and function-level semantic representation construction that produces features for downstream prediction tasks.
If this is right
- Toolchain prediction reaches perfect scores once semantic representations are derived from the extracted core functions.
- Functionality prediction reaches 49.18 percent F1 across 22 labels, which suggests that the semantic features capture part of the program behavior.
- Security analysts obtain an interpretable method for examining PLC binaries during post-attack forensics and incident response.
- Binary analysis for industrial control systems works across multiple vendor platforms without requiring separate tools for each format.
Where Pith is reading between the lines
- Extending core-function extraction to additional PLC platforms could broaden the workflow beyond the four evaluated here.
- Refining the semantic representations might close the gap between perfect toolchain results and the current functionality scores.
- Pairing the static semantic approach with runtime observation of PLC programs could yield more complete behavior models.
- Security systems for industrial control could embed similar function-level extraction to flag unexpected changes in deployed binaries.
Load-bearing premise
The core-function extraction step can reliably separate control logic from runtime code across heterogeneous PLC platforms without losing task-relevant semantics.
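The simulated rebuttal in this review describes the extraction heuristics as identifying entry-point functions and excluding standard-library call-graph components. One way to picture that idea is a reachability walk over a call graph that refuses to enter known runtime functions. The sketch below is an illustration under that assumption, not the paper's procedure; the function name, the graph encoding, and all symbol names in the example are hypothetical.

```python
from collections import deque


def extract_core_functions(call_graph, entry_points, runtime_functions):
    """Return functions reachable from the entry points without entering runtime code.

    call_graph maps a function name to the names of the functions it calls.
    """
    runtime = set(runtime_functions)
    core = set()
    queue = deque(e for e in entry_points if e not in runtime)
    while queue:
        fn = queue.popleft()
        if fn in core:
            continue
        core.add(fn)
        for callee in call_graph.get(fn, ()):
            # Skip runtime/library callees so their subgraphs are never explored.
            if callee not in runtime and callee not in core:
                queue.append(callee)
    return core


# Hypothetical call graph: control logic mixed with runtime helpers.
graph = {
    "PLC_PRG": ["pump_ctrl", "memcpy"],
    "pump_ctrl": ["pid_step", "rt_log"],
    "rt_log": ["rt_flush"],
    "memcpy": [],
}
core = extract_core_functions(graph, ["PLC_PRG"], {"memcpy", "rt_log", "rt_flush"})
print(sorted(core))
```

A walk like this depends entirely on how complete the runtime list is and how accurate the recovered call graph is, which is why the premise above carries so much of the argument.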
What would settle it
A direct comparison on new PLC binaries would settle it: if manually verified core functions differ substantially from those the workflow extracts, and the extracted functions yield lower prediction scores, then the separation step does not preserve the needed semantics.
Original abstract
As emerging attacks increasingly target Industrial Control Systems (ICS), the security of Programmable Logic Controllers (PLCs) has become a critical concern. Binary Code Analysis (BCA), which enables analysts to understand compiled programs without source code, is essential for ICS security tasks such as post-attack digital forensics and incident response. However, automated BCA for PLC binaries remains challenging due to three key issues: heterogeneous binary formats across PLC platforms, entangled program semantics caused by the mixture of control logic with runtime code, and limited semantic representations for interpretable and learning-based downstream analysis. In this paper, we present PLC-BinX, a BCA workflow for cross-platform PLC binary understanding. PLC-BinX analyzes PLC binaries from four platforms: CODESYS v3, GEB, OpenPLC v2, and OpenPLC v3, and recovers function-level information through cross-platform reverse engineering, core-function extraction, and function-level semantic representation construction. Based on the recovered semantic representations, we further study two downstream tasks: toolchain prediction and functionality prediction. Under ten-fold program-level evaluation, PLC-BinX achieves 100.00% precision, recall, and F1 in toolchain prediction, and 51.43% precision, 49.38% recall, and 49.18% F1 in functionality prediction over 22 labels. The results demonstrate that PLC-BinX provides an effective and interpretable approach to cross-platform PLC binary understanding by exposing task-relevant function-level semantics from heterogeneous PLC binaries.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces PLC-BinX, a three-stage binary code analysis workflow for cross-platform PLC binaries from CODESYS v3, GEB, OpenPLC v2, and OpenPLC v3. It performs cross-platform reverse engineering, core-function extraction to separate control logic from runtime code, and constructs function-level semantic representations. These are applied to two downstream tasks—toolchain prediction and functionality prediction over 22 labels—under ten-fold program-level evaluation, reporting 100% precision/recall/F1 for toolchain prediction and 51.43%/49.38%/49.18% for functionality prediction. The work targets challenges in heterogeneous formats, entangled semantics, and limited representations for ICS security applications such as forensics.
Significance. If the central claims hold, this work offers a practical and interpretable advance in automated analysis of PLC binaries, which is valuable for industrial control system security, post-attack digital forensics, and incident response. The perfect toolchain prediction score indicates that the recovered semantic representations effectively capture platform-discriminative features, while the moderate functionality prediction performance is consistent with the difficulty of a 22-label task. The explicit ten-fold program-level evaluation and emphasis on function-level semantics rather than whole-binary features are methodological strengths that enhance reproducibility and utility.
major comments (1)
- [§3.2] §3.2 (core-function extraction within the three-stage workflow): the manuscript asserts that this step reliably isolates control logic from runtime code across the four heterogeneous platforms without losing task-relevant semantics, yet supplies no quantitative fidelity metric, inter-annotator agreement, manual validation results, or ablation that isolates its contribution. Because both the 100% toolchain and ~49% functionality results rest directly on the quality of these extracted functions, the absence of such validation leaves the cross-platform generalizability of the reported performance unconfirmed.
minor comments (2)
- [Abstract] Abstract and evaluation section: the reported performance figures omit essential dataset details including total number of binaries/programs, per-platform counts, label distribution across the 22 functionality classes, any post-hoc data exclusions, and whether error bars or statistical tests accompany the F1 scores.
- [Evaluation] Evaluation protocol: clarify exactly how the ten-fold program-level split was constructed (e.g., whether entire programs or individual binaries were held out, and how platform balance was maintained across folds) to allow readers to assess potential leakage or bias.
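The leakage concern in the second minor comment can be made concrete. If one source program is compiled for several platforms, a program-level split keeps all of its binaries in the same fold. The sketch below shows one such assignment; it is an assumption about what "program-level" might mean, not the paper's protocol, and the function names and binary identifiers are hypothetical.

```python
import random
from collections import defaultdict


def program_level_folds(binary_to_program, k=10, seed=0):
    """Split binaries into k folds so that all binaries of a program share a fold."""
    programs = sorted(set(binary_to_program.values()))
    random.Random(seed).shuffle(programs)
    # Round-robin over shuffled programs keeps fold sizes roughly balanced.
    fold_of_program = {prog: i % k for i, prog in enumerate(programs)}
    folds = defaultdict(list)
    for binary, prog in sorted(binary_to_program.items()):
        folds[fold_of_program[prog]].append(binary)
    return [folds[i] for i in range(k)]


def has_leakage(train, test, binary_to_program):
    """True if any program contributes binaries to both the train and test sets."""
    train_programs = {binary_to_program[b] for b in train}
    return any(binary_to_program[b] in train_programs for b in test)


# Hypothetical corpus: 20 programs, each compiled for the four platforms.
platforms = ["codesys_v3", "geb", "openplc_v2", "openplc_v3"]
corpus = {f"prog{i:02d}.{p}": f"prog{i:02d}" for i in range(20) for p in platforms}
folds = program_level_folds(corpus, k=10)
test = folds[0]
train = [b for fold in folds[1:] for b in fold]
print(len(test), len(train), has_leakage(train, test, corpus))
```

This sketch does not balance platforms or labels across folds. Holding out individual binaries instead would place variants of one program on both sides of the split, which is the leakage scenario the comment asks the authors to rule out.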
Simulated Author's Rebuttal
We thank the referee for the constructive feedback and for recognizing the potential value of PLC-BinX for ICS security applications. We address the single major comment below and will incorporate the requested validation in the revised manuscript.
Referee: [§3.2] §3.2 (core-function extraction within the three-stage workflow): the manuscript asserts that this step reliably isolates control logic from runtime code across the four heterogeneous platforms without losing task-relevant semantics, yet supplies no quantitative fidelity metric, inter-annotator agreement, manual validation results, or ablation that isolates its contribution. Because both the 100% toolchain and ~49% functionality results rest directly on the quality of these extracted functions, the absence of such validation leaves the cross-platform generalizability of the reported performance unconfirmed.
Authors: We agree that an explicit quantitative validation of the core-function extraction step would strengthen the claims. The extraction procedure in §3.2 relies on platform-specific heuristics derived from the known runtime structures of CODESYS v3, GEB, OpenPLC v2, and OpenPLC v3 (e.g., identification of entry-point functions and exclusion of standard library call-graph components). While the perfect toolchain-prediction results provide indirect evidence that the extracted functions retain platform-discriminative semantics, we acknowledge the absence of direct fidelity metrics. In the revised manuscript we will add (i) a manual validation on a randomly sampled subset of 50 binaries per platform, reporting precision and recall of extracted control-logic functions against human-annotated ground truth, (ii) inter-annotator agreement statistics (Cohen’s kappa) for the annotations, and (iii) an ablation study that compares downstream prediction performance with and without the core-function extraction step. These additions will be placed in a new subsection of §3.2 and will be reflected in the experimental results of §4.
Revision: yes
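Two of the validation measures promised in the response have standard definitions: set-based precision and recall of extracted functions against annotated ground truth, and Cohen's kappa for two annotators. The sketch below computes both. The helper names, function identifiers, and labels are hypothetical, and nothing here reflects results from the paper.

```python
from collections import Counter


def extraction_precision_recall(extracted, ground_truth):
    """Precision and recall of extracted functions against annotated core functions."""
    extracted, ground_truth = set(extracted), set(ground_truth)
    true_positives = len(extracted & ground_truth)
    precision = true_positives / len(extracted) if extracted else 0.0
    recall = true_positives / len(ground_truth) if ground_truth else 0.0
    return precision, recall


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labeling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("annotators must label the same non-empty item list")
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    # Chance agreement from each annotator's marginal label frequencies.
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical annotations of six functions as control logic ("core") or runtime.
annotator_a = ["core", "core", "runtime", "runtime", "core", "runtime"]
annotator_b = ["core", "runtime", "runtime", "runtime", "core", "runtime"]
print(cohens_kappa(annotator_a, annotator_b))
print(extraction_precision_recall({"f1", "f2", "f3"}, {"f2", "f3", "f4", "f5"}))
```

Reporting these per platform rather than pooled would show whether the extraction heuristics degrade on any one of the four formats.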
Circularity Check
No circularity: empirical results from explicit cross-validation on held-out binaries
The paper describes an engineering workflow (reverse engineering, core-function extraction, semantic representation) followed by two downstream ML tasks whose performance is measured via ten-fold program-level evaluation on collected binaries. These metrics are computed on held-out folds rather than reducing by construction to any fitted parameter, self-defined quantity, or self-citation chain. No equations or uniqueness theorems are invoked that would make the reported precision/recall equivalent to the input data by definition; the central claims remain independent empirical observations.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Binaries from CODESYS v3, GEB, OpenPLC v2, and OpenPLC v3 contain recoverable function-level control logic that can be separated from runtime code.