One Step Further: Understanding PLC Binaries Through Cross-Platform Reverse Engineering and Function-Level Semantic Analysis

Ang Jia; He Jiang; Xiaochen Li; Yaxin Duan; Zhenzhou Tian; Zhilei Ren

arxiv: 2605.17392 · v2 · pith:XP3V23ZZnew · submitted 2026-05-17 · 💻 cs.SE


Pith reviewed 2026-05-21 08:26 UTC · model grok-4.3

classification 💻 cs.SE

keywords: PLC binary analysis · cross-platform reverse engineering · function-level semantics · industrial control systems security · toolchain prediction · functionality prediction · binary code analysis · core-function extraction


The pith

PLC-BinX recovers function-level semantics from PLC binaries on four platforms to enable perfect toolchain prediction and moderate functionality prediction.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents PLC-BinX as a workflow for binary code analysis of Programmable Logic Controllers used in industrial systems. It targets three difficulties that block automated understanding of compiled PLC programs: different binary formats across platforms, the mixing of control logic with supporting runtime code, and the lack of clear semantic features for further analysis. The workflow first applies cross-platform reverse engineering, then isolates core functions that carry the main control logic, and finally builds semantic representations of those functions. These representations support two downstream tasks: the system identifies the toolchain that produced each binary with 100 percent precision, recall, and F1, and it assigns functionality labels with about 49 percent F1 across 22 categories. The results matter for security work because they give analysts a way to examine PLC binaries after attacks without source code.

Core claim

PLC-BinX analyzes PLC binaries from CODESYS v3, GEB, OpenPLC v2, and OpenPLC v3 through cross-platform reverse engineering, core-function extraction, and function-level semantic representation construction. Using the resulting representations under ten-fold program-level evaluation, the system reaches 100.00 percent precision, recall, and F1 in toolchain prediction and 51.43 percent precision, 49.38 percent recall, and 49.18 percent F1 in functionality prediction over 22 labels. The approach therefore exposes task-relevant function-level semantics from heterogeneous PLC binaries in an effective and interpretable form.
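One detail in these figures is worth a worked example: the reported F1 (49.18) is lower than both the precision (51.43) and the recall (49.38), whereas the harmonic mean of those two numbers would be about 50.38. That pattern is what per-label (macro) averaging can produce, although the abstract does not state which averaging scheme was used, so treat this as an assumption. The sketch below, with a hypothetical helper `macro_prf` and toy labels that do not come from the paper, shows how macro-F1 can fall below both macro-precision and macro-recall.

```python
from collections import Counter


def harmonic_mean(p, r):
    """F1 computed directly from one precision/recall pair."""
    return 2 * p * r / (p + r) if (p + r) else 0.0


def macro_prf(y_true, y_pred):
    """Macro-averaged precision, recall, and F1 (single-label case).

    Each label gets its own precision, recall, and F1; the three
    returned values are unweighted means over all labels.
    """
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted label gains a false positive
            fn[t] += 1  # true label gains a false negative

    precisions, recalls, f1s = [], [], []
    for label in labels:
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        p = tp[label] / p_den if p_den else 0.0
        r = tp[label] / r_den if r_den else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(harmonic_mean(p, r))

    n = len(labels)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


# Toy data (assumption, not from the paper): label A is rarely predicted,
# label B is over-predicted.
y_true = ["A", "A", "A", "A", "A", "B"]
y_pred = ["A", "B", "B", "B", "B", "B"]
p, r, f = macro_prf(y_true, y_pred)
print(f"macro P={p:.3f} R={r:.3f} F1={f:.3f}")

# Harmonic mean of the reported aggregate precision and recall.
print(f"harmonic mean of 51.43 and 49.38: {harmonic_mean(51.43, 49.38):.2f}")
```

In the toy data each label has one strong and one weak score, so every per-label F1 is low even though the averaged precision and recall look moderate. Averaging per-fold scores across the ten folds could produce a similar gap, so the paper's full text is the place to confirm which scheme applies.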

What carries the argument

The three-stage workflow of cross-platform reverse engineering, core-function extraction to isolate control logic, and function-level semantic representation construction that produces features for downstream prediction tasks.

If this is right

Toolchain prediction reaches perfect scores once semantic representations are derived from the extracted core functions.
Functionality prediction achieves nearly 50 percent F1 across 22 labels, showing that the semantic features capture meaningful program behavior.
Security analysts obtain an interpretable method for examining PLC binaries during post-attack forensics and incident response.
Binary analysis for industrial control systems works across multiple vendor platforms without requiring separate tools for each format.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Extending core-function extraction to additional PLC platforms could broaden the workflow beyond the four evaluated here.
Refining the semantic representations might close the gap between perfect toolchain results and the current functionality scores.
Pairing the static semantic approach with runtime observation of PLC programs could yield more complete behavior models.
Security systems for industrial control could embed similar function-level extraction to flag unexpected changes in deployed binaries.

Load-bearing premise

The core-function extraction step can reliably separate control logic from runtime code across heterogeneous PLC platforms without losing task-relevant semantics.

What would settle it

A direct comparison on new PLC binaries would settle it: if the core functions extracted by the workflow differ substantially from manually verified ones, and prediction scores drop as a result, then the separation step does not preserve the needed semantics.

Figures

Figures reproduced from arXiv: 2605.17392 by Ang Jia, He Jiang, Xiaochen Li, Yaxin Duan, Zhenzhou Tian, Zhilei Ren.

**Figure 1.** How PLCs connect supervisory components, field devices, and physical processes in a typical ICS. PLCs receive sensor inputs from the field, execute control logic, and output commands to actuators such as motors, valves, pumps, and relays. In a typical deployment, supervisory components such as engineering workstations, Human–Machine Interfaces (HMIs), or SCADA servers configure and monitor the contro…

**Figure 2.** Example of PLC source organization using POUs.

**Figure 3.** Overview of the PLC-BinX workflow.

**Table I.** PLC platforms and binary forms in the dataset.

| Platform | Binary form | # Binaries |
|---|---|---|
| CODESYS v3 [7] | .app container | 555 |
| GEB [3] | ARM ELF executable | 617 |
| OpenPLC v2 [9] | PE executable | 619 |
| OpenPLC v3 [9] | PE executable | 640 |
| Total | – | 2,431 |

A. Input PLC Binaries. We use PLC-BEAD [3] as the dataset for our study. To the best of our knowledge, PLC-BEAD is the only publicly available…

**Figure 4.** Workflow of PLC-BinX for toolchain prediction.

**Figure 5.** Workflow of PLC-BinX for functionality prediction.

Original abstract

As emerging attacks increasingly target Industrial Control Systems (ICS), the security of Programmable Logic Controllers (PLCs) has become a critical concern. Binary Code Analysis (BCA), which enables analysts to understand compiled programs without source code, is essential for ICS security tasks such as post-attack digital forensics and incident response. However, automated BCA for PLC binaries remains challenging due to three key issues: heterogeneous binary formats across PLC platforms, entangled program semantics caused by the mixture of control logic with runtime code, and limited semantic representations for interpretable and learning-based downstream analysis. In this paper, we present PLC-BinX, a BCA workflow for cross-platform PLC binary understanding. PLC-BinX analyzes PLC binaries from four platforms: CODESYS v3, GEB, OpenPLC v2, and OpenPLC v3, and recovers function-level information through cross-platform reverse engineering, core-function extraction, and function-level semantic representation construction. Based on the recovered semantic representations, we further study two downstream tasks: toolchain prediction and functionality prediction. Under ten-fold program-level evaluation, PLC-BinX achieves 100.00% precision, recall, and F1 in toolchain prediction, and 51.43% precision, 49.38% recall, and 49.18% F1 in functionality prediction over 22 labels. The results demonstrate that PLC-BinX provides an effective and interpretable approach to cross-platform PLC binary understanding by exposing task-relevant function-level semantics from heterogeneous PLC binaries.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

PLC-BinX gives a workable cross-platform pipeline for PLC binaries with perfect toolchain prediction but leaves the core-function extraction step without quantitative checks.


The paper's main contribution is a three-stage workflow called PLC-BinX that takes binaries from CODESYS v3, GEB, OpenPLC v2, and OpenPLC v3, pulls out core functions, builds function-level semantic representations, and feeds them into toolchain and functionality predictors. This combination for PLCs is new relative to the prior work cited in the abstract.

It does well by running a ten-fold program-level evaluation and reporting exact numbers: 100% precision, recall, and F1 on toolchain prediction, plus 51.43% precision, 49.38% recall, and 49.18% F1 on functionality across 22 labels. Those results are concrete and directly relevant to digital forensics in industrial control systems where source code is unavailable. The evaluation uses explicit program-level splits rather than parameter fitting, which keeps the claims non-circular.

The soft spot is the core-function extraction step. The abstract describes it as isolating control logic from runtime code without semantic loss, but supplies no fidelity metric, ablation, or inter-annotator check to confirm the separation holds across the four platforms. If that step introduces platform-specific noise or drops task-relevant details, the downstream semantic representations and the reported scores become harder to trust. The functionality prediction numbers around 50% are consistent with the claim but also show the representations still have room to improve. Dataset size and label construction details are not in the abstract, though the full text may clarify them.

This work is for researchers focused on binary analysis and security for PLCs and similar embedded industrial systems. A reader looking for a practical starting point on cross-platform reverse engineering would find the pipeline and numbers useful. It deserves peer review because it addresses a real gap with measurable results on actual platforms, even if the extraction validation needs strengthening.

Referee Report

1 major / 2 minor

Summary. The paper introduces PLC-BinX, a three-stage binary code analysis workflow for cross-platform PLC binaries from CODESYS v3, GEB, OpenPLC v2, and OpenPLC v3. It performs cross-platform reverse engineering, core-function extraction to separate control logic from runtime code, and constructs function-level semantic representations. These are applied to two downstream tasks—toolchain prediction and functionality prediction over 22 labels—under ten-fold program-level evaluation, reporting 100% precision/recall/F1 for toolchain prediction and 51.43%/49.38%/49.18% for functionality prediction. The work targets challenges in heterogeneous formats, entangled semantics, and limited representations for ICS security applications such as forensics.

Significance. If the central claims hold, this work offers a practical and interpretable advance in automated analysis of PLC binaries, which is valuable for industrial control system security, post-attack digital forensics, and incident response. The perfect toolchain prediction score indicates that the recovered semantic representations effectively capture platform-discriminative features, while the moderate functionality prediction performance is consistent with the difficulty of a 22-label task. The explicit ten-fold program-level evaluation and emphasis on function-level semantics rather than whole-binary features are methodological strengths that enhance reproducibility and utility.

major comments (1)

[§3.2] §3.2 (core-function extraction within the three-stage workflow): the manuscript asserts that this step reliably isolates control logic from runtime code across the four heterogeneous platforms without losing task-relevant semantics, yet supplies no quantitative fidelity metric, inter-annotator agreement, manual validation results, or ablation that isolates its contribution. Because both the 100% toolchain and ~49% functionality results rest directly on the quality of these extracted functions, the absence of such validation leaves the cross-platform generalizability of the reported performance unconfirmed.
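The two validation measures named in this comment can be made concrete with a short sketch. It assumes a setup the source does not describe: for each binary, a set of function names returned by the extractor and a set marked as control logic by a human annotator, plus a second annotator's labels for the agreement check. The helper names `set_precision_recall` and `cohens_kappa` and all function names in the toy data are hypothetical.

```python
def set_precision_recall(extracted, annotated):
    """Precision and recall of an extracted function set against ground truth."""
    extracted, annotated = set(extracted), set(annotated)
    hits = len(extracted & annotated)
    precision = hits / len(extracted) if extracted else 0.0
    recall = hits / len(annotated) if annotated else 0.0
    return precision, recall


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("annotators must label the same non-empty item list")
    n = len(labels_a)
    # Observed agreement: fraction of items with identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from each annotator's marginal label frequencies.
    categories = set(labels_a) | set(labels_b)
    expected = sum(
        (labels_a.count(c) / n) * (labels_b.count(c) / n) for c in categories
    )
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1 - expected)


# Toy example (assumption): names do not come from the paper or the dataset.
extracted = {"PLC_PRG", "FB_Motor", "SysInit"}
annotated = {"PLC_PRG", "FB_Motor", "FB_Valve"}
precision, recall = set_precision_recall(extracted, annotated)
print(f"precision={precision:.2f} recall={recall:.2f}")

# 1 = "control logic", 0 = "runtime code", one entry per function.
annotator_a = [1, 1, 0, 0, 1, 0, 1, 1]
annotator_b = [1, 0, 0, 0, 1, 0, 1, 1]
print(f"kappa={cohens_kappa(annotator_a, annotator_b):.2f}")
```

Reporting both numbers per platform, rather than pooled, would speak most directly to the cross-platform concern raised here.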

minor comments (2)

[Abstract] Abstract and evaluation section: the reported performance figures omit essential dataset details including total number of binaries/programs, per-platform counts, label distribution across the 22 functionality classes, any post-hoc data exclusions, and whether error bars or statistical tests accompany the F1 scores.
[Evaluation] Evaluation protocol: clarify exactly how the ten-fold program-level split was constructed (e.g., whether entire programs or individual binaries were held out, and how platform balance was maintained across folds) to allow readers to assess potential leakage or bias.
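The leakage concern in the last comment is easier to see with a sketch of one possible program-level split. It assumes that each binary carries the identifier of the source program it was compiled from and that every binary of a program is held out together; the source does not confirm that the paper does this. The function `program_level_folds` and the toy identifiers are hypothetical.

```python
import random
from collections import defaultdict


def program_level_folds(binaries, k=10, seed=0):
    """Split binaries into k folds so that no program spans train and test.

    binaries: list of (binary_id, program_id) pairs.
    Returns a list of k (train_ids, test_ids) tuples.
    """
    by_program = defaultdict(list)
    for binary_id, program_id in binaries:
        by_program[program_id].append(binary_id)

    programs = sorted(by_program)
    random.Random(seed).shuffle(programs)
    # Deal programs round-robin into k folds of near-equal size.
    folds = [programs[i::k] for i in range(k)]

    splits = []
    for held_out in folds:
        held = set(held_out)
        test = [b for p in held_out for b in by_program[p]]
        train = [b for p in programs if p not in held for b in by_program[p]]
        splits.append((train, test))
    return splits


# Toy data (assumption): 20 programs, each compiled for four platforms.
platforms = ("codesys3", "geb", "openplc2", "openplc3")
toy = [(f"prog{p:02d}-{plat}", f"prog{p:02d}") for p in range(20) for plat in platforms]
splits = program_level_folds(toy, k=10)
print(len(splits), "folds; first test fold:", splits[0][1])
```

If individual binaries were shuffled instead, the same program compiled for two platforms could land on both sides of a split, which is the kind of leakage the comment asks the authors to rule out. Platform balance per fold is not enforced in this sketch.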

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive feedback and for recognizing the potential value of PLC-BinX for ICS security applications. We address the single major comment below and will incorporate the requested validation in the revised manuscript.


Referee: [§3.2] §3.2 (core-function extraction within the three-stage workflow): the manuscript asserts that this step reliably isolates control logic from runtime code across the four heterogeneous platforms without losing task-relevant semantics, yet supplies no quantitative fidelity metric, inter-annotator agreement, manual validation results, or ablation that isolates its contribution. Because both the 100% toolchain and ~49% functionality results rest directly on the quality of these extracted functions, the absence of such validation leaves the cross-platform generalizability of the reported performance unconfirmed.

Authors: We agree that an explicit quantitative validation of the core-function extraction step would strengthen the claims. The extraction procedure in §3.2 relies on platform-specific heuristics derived from the known runtime structures of CODESYS v3, GEB, OpenPLC v2, and OpenPLC v3 (e.g., identification of entry-point functions and exclusion of standard library call-graph components). While the perfect toolchain-prediction results provide indirect evidence that the extracted functions retain platform-discriminative semantics, we acknowledge the absence of direct fidelity metrics. In the revised manuscript we will add (i) a manual validation on a randomly sampled subset of 50 binaries per platform, reporting precision and recall of extracted control-logic functions against human-annotated ground truth, (ii) inter-annotator agreement statistics (Cohen’s kappa) for the annotations, and (iii) an ablation study that compares downstream prediction performance with and without the core-function extraction step. These additions will be placed in a new subsection of §3.2 and will be reflected in the experimental results of §4.

Revision: yes
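The heuristic mentioned in this response, starting from entry-point functions and excluding standard library call-graph components, can be pictured as a reachability walk. The sketch below is only an illustration under strong assumptions: a call graph is already recovered as a dictionary, the runtime and library functions are known by name, and nothing reached solely through a runtime function counts as control logic. It is not the paper's algorithm, and `extract_core_functions` and every function name in the toy graph are hypothetical.

```python
from collections import deque


def extract_core_functions(call_graph, entry_points, runtime_functions):
    """Collect functions reachable from entry points without passing
    through any runtime or library function.

    call_graph: dict mapping a function name to the names it calls.
    """
    runtime = set(runtime_functions)
    core = set()
    queue = deque(e for e in entry_points if e not in runtime)
    while queue:
        fn = queue.popleft()
        if fn in core:
            continue
        core.add(fn)
        for callee in call_graph.get(fn, ()):
            # Runtime callees are neither kept nor traversed.
            if callee not in runtime and callee not in core:
                queue.append(callee)
    return core


# Toy call graph (assumption): names are illustrative only.
call_graph = {
    "PLC_PRG": ["FB_Motor", "memcpy"],
    "FB_Motor": ["FB_Timer", "__rt_log"],
    "__rt_init": ["memcpy"],
    "memcpy": ["FB_Hidden"],
}
core = extract_core_functions(
    call_graph,
    entry_points=["PLC_PRG"],
    runtime_functions={"memcpy", "__rt_log", "__rt_init"},
)
print(sorted(core))
```

A walk like this drops `FB_Hidden` because it is reachable only through a runtime function, which illustrates how an extraction heuristic could lose task-relevant code and why the promised fidelity check matters.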

Circularity Check

0 steps flagged

No circularity: empirical results from explicit cross-validation on held-out binaries


The paper describes an engineering workflow (reverse engineering, core-function extraction, semantic representation) followed by two downstream ML tasks whose performance is measured via ten-fold program-level evaluation on collected binaries. These metrics are computed on held-out folds rather than reducing by construction to any fitted parameter, self-defined quantity, or self-citation chain. No equations or uniqueness theorems are invoked that would make the reported precision/recall equivalent to the input data by definition; the central claims remain independent empirical observations.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The approach rests on standard reverse-engineering assumptions and the domain premise that the four chosen PLC platforms share enough structure for cross-platform techniques to apply. No free parameters or invented entities are introduced in the abstract.

axioms (1)

Domain assumption: Binaries from CODESYS v3, GEB, OpenPLC v2, and OpenPLC v3 contain recoverable function-level control logic that can be separated from runtime code.
Invoked when the paper states that core-function extraction works across these heterogeneous formats.



Reference graph

Works this paper leans on

42 extracted references · 42 canonical work pages

[1] D. E. Denning, “Stuxnet: What has changed?” Future Internet, vol. 4, no. 3, pp. 672–687, 2012

[2] Nozomi Networks Labs, “Backdooring CODESYS applications via vulnerability chaining,” Nozomi Networks Blog, Apr. 2026, accessed: 2026-05-08. [Online]. Available: https://www.nozominetworks.com/blog/backdooring-codesys-applications-via-vulnerability-chaining

[3] Y. G. Achamyeleh, S.-Y. Yu, G. Q. Araya, and M. A. Al Faruque, “Bridging the binary analysis gap: A cross-compiler dataset and neural framework for industrial control systems,” in Proceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining V.2, 2025, pp. 5260–5269

[4] A. Keliris and M. Maniatakos, “ICSREF: A framework for automated reverse engineering of industrial control systems binaries,” in 26th Annual Network and Distributed System Security Symposium, NDSS 2019, San Diego, California, USA, February 24-27, 2019. The Internet Society, 2019

[5] PLCopen, “IEC 61131-3 programming languages,” Technical documentation, 2013. [Online]. Available: https://plcopen.org/iec-61131-3

[6] CODESYS GmbH, “Generating application code,” https://content.helpme-codesys.com/en/CODESYS%20Development%20System/ cds creating application code for plc.html, accessed: 2026-05-08

[7] CODESYS, “Creating a boot application,” https://content.helpme-codesys.com/en/CODESYS%20Development%20System/ cds creating a boot application.html, accessed: 2026-05-08
[8] OpenPLC, “MATIEC: IEC 61131-3 compiler,” https://openplcproject.gitlab.io/matiec/, accessed: 2026-05-08

[9] AUTONOMY, “The Future of Industrial Automation,” https://autonomylogic.com/, accessed: 2026-05-11

[10] N. A. Quynh, “Capstone: Next-gen disassembly framework,” Black Hat USA, vol. 5, no. 2, pp. 3–8, 2014

[11] Hex-Rays, “IDA Pro disassembler and debugger,” Software, 2024. [Online]. Available: https://hex-rays.com/ida-pro/

[12] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, L. Kaiser, and I. Polosukhin, “Attention is all you need,” in Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, December 4-9, 2017, Long Beach, CA, USA, 2017, pp. 5998–6008

[13] W. L. Hamilton, Z. Ying, and J. Leskovec, “Inductive representation learning on large graphs,” in Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, December 4-9, 2017, Long Beach, CA, USA, 2017, pp. 1024–1034

[14] National Security Agency, “Ghidra software reverse engineering framework,” Software, 2019. [Online]. Available: https://ghidra-sre.org/
[15] Y. Shoshitaishvili, R. Wang, C. Salls, N. Stephens, M. Polino, A. Dutcher, J. Grosen, S. Feng, C. Hauser, C. Kruegel et al., “Sok: (state of) the art of war: Offensive techniques in binary analysis,” in 2016 IEEE symposium on security and privacy (SP). IEEE, 2016, pp. 138–157

[16] D. Andriesse, X. Chen, V. Van Der Veen, A. Slowinska, and H. Bos, “An In-Depth analysis of disassembly on Full-Scale x86/x64 binaries,” in 25th USENIX security symposium (USENIX security 16), 2016, pp. 583–600

[17] K. Pei, J. Guan, D. Williams-King, J. Yang, and S. Jana, “XDA: accurate, robust disassembly with transfer learning,” in 28th Annual Network and Distributed System Security Symposium, NDSS 2021, virtually, February 21-25, 2021. The Internet Society, 2021

[18] S. Senthivel, S. Dhungana, H. Yoo, I. Ahmed, and V. Roussev, “Denial of engineering operations attacks in industrial control systems,” in Proceedings of the Eighth ACM Conference on Data and Application Security and Privacy, 2018, pp. 319–329

[19] S. A. Qasim, J. Lopez Jr, and I. Ahmed, “Automated reconstruction of control logic for programmable logic controller forensics,” in International Conference on Information Security. Springer, 2019, pp. 402–422

[20] C. Sang, J. Wu, J. Li, and M. Guizani, “From control application to control logic: Plc decompile framework for industrial control system,” IEEE Transactions on Information Forensics and Security, vol. 19, pp. 8685–8700, 2024

[21] Y. Geng, X. Che, R. Ma, Q. Wei, M. Wang, and Y. Chen, “Control logic attack detection and forensics through reverse-engineering and verifying plc control applications,” IEEE Internet of Things Journal, vol. 11, no. 5, pp. 8386–8400, 2023
[22] H. Benkraouda, A. Agrawal, D. Tychalas, M. Sazos, and M. Maniatakos, “Towards plc-specific binary analysis tools: An investigation of codesys-compiled plc software applications,” in Proceedings of the 5th Workshop on CPS&IoT Security and Privacy, 2023, pp. 83–89

[23] I. U. Haq and J. Caballero, “A survey of binary code similarity,” Acm computing surveys (csur), vol. 54, no. 3, pp. 1–38, 2021

[24] A. Marcelli, M. Graziano, X. Ugarte-Pedrero, Y. Fratantonio, M. Mansouri, and D. Balzarotti, “How machine learning is solving the binary function similarity problem,” in 31st USENIX Security Symposium (USENIX Security 22), 2022, pp. 2099–2116

[25] S. Eschweiler, K. Yakdan, E. Gerhards-Padilla et al., “Discovre: Efficient cross-architecture identification of bugs in binary code,” in NDSS, vol. 52, 2016, pp. 58–79

[26] Q. Feng, R. Zhou, C. Xu, Y. Cheng, B. Testa, and H. Yin, “Scalable graph-based bug search for firmware images,” in Proceedings of the 2016 ACM SIGSAC conference on computer and communications security, 2016, pp. 480–491

[27] X. Xu, C. Liu, Q. Feng, H. Yin, L. Song, and D. Song, “Neural network-based graph embedding for cross-platform binary code similarity detection,” in Proceedings of the 2017 ACM SIGSAC conference on computer and communications security, 2017, pp. 363–376

[28] J. Gao, X. Yang, Y. Fu, Y. Jiang, and J. Sun, “Vulseeker: A semantic learning based vulnerability seeker for cross-platform binary,” in Proceedings of the 33rd ACM/IEEE International Conference on Automated Software Engineering, 2018, pp. 896–899
[29] Z. Yu, R. Cao, Q. Tang, S. Nie, J. Huang, and S. Wu, “Order matters: Semantic-aware neural networks for binary code similarity detection,” in Proceedings of the AAAI conference on artificial intelligence, vol. 34, no. 01, 2020, pp. 1145–1152

[30] Y. Duan, X. Li, J. Wang, and H. Yin, “Deepbindiff: Learning program-wide code representations for binary diffing,” in 27th Annual Network and Distributed System Security Symposium, NDSS 2020, San Diego, California, USA, February 23-26, 2020. The Internet Society, 2020

[31] L. Massarelli, G. A. Di Luna, F. Petroni, R. Baldoni, and L. Querzoni, “Safe: Self-attentive function embeddings for binary similarity,” in International Conference on Detection of Intrusions and Malware, and Vulnerability Assessment. Springer, 2019, pp. 309–329

[32] S. H. Ding, B. C. Fung, and P. Charland, “Asm2vec: Boosting static representation robustness for binary clone search against code obfuscation and compiler optimization,” in 2019 ieee symposium on security and privacy (sp). IEEE, 2019, pp. 472–489

[33] X. Li, Y. Qu, and H. Yin, “Palmtree: Learning an assembly language model for instruction embedding,” in Proceedings of the 2021 ACM SIGSAC conference on computer and communications security, 2021, pp. 3236–3251

[34] H. Wang, W. Qu, G. Katz, W. Zhu, Z. Gao, H. Qiu, J. Zhuge, and C. Zhang, “Jtrans: Jump-aware transformer for binary code similarity detection,” in Proceedings of the 31st ACM SIGSOFT international symposium on software testing and analysis, 2022, pp. 1–13

[35] K. Pei, Z. Xuan, J. Yang, S. Jana, and B. Ray, “Trex: Learning execution semantics from micro-traces for binary similarity,” arXiv preprint arXiv:2012.08680, 2020
[36] A. Jia, M. Fan, W. Jin, X. Xu, Z. Zhou, Q. Tang, S. Nie, S. Wu, and T. Liu, “1-to-1 or 1-to-n? investigating the effect of function inlining on binary similarity analysis,” ACM Transactions on Software Engineering and Methodology, vol. 32, no. 4, pp. 1–26, 2023

[37] A. Jia, M. Fan, X. Xu, W. Jin, H. Wang, and T. Liu, “Cross-inlining binary function similarity detection,” in Proceedings of the IEEE/ACM 46th international conference on software engineering, 2024, pp. 1–13

[38] D. Tychalas, H. Benkraouda, and M. Maniatakos, “ICSFuzz: Manipulating I/Os and repurposing binary code to enable instrumented fuzzing in ICS control applications,” in 30th USENIX Security Symposium (USENIX Security 21), 2021, pp. 2847–2862

[39] A. Bytes, P. H. N. Rajput, C. Doumanidis, M. Maniatakos, J. Zhou, and N. O. Tippenhauer, “Fieldfuzz: In situ blackbox fuzzing of proprietary industrial automation runtimes via the network,” in Proceedings of the 26th International Symposium on Research in Attacks, Intrusions and Defenses, 2023, pp. 499–512

[40] C. Villa, C. Doumanidis, H. Lamri, P. H. N. Rajput, and M. Maniatakos, “Icsquartz: Scan cycle-aware and vendor-agnostic fuzzing for industrial control systems,” in NDSS, 2025

[41] I. D. Lopez-Miguel, J.-C. Tournier, and B. F. Adiego, “Plcverif: Status of a formal verification tool for programmable logic controller,” arXiv preprint arXiv:2203.17253, 2022

[42] X. Zhang, J. Li, J. Wu, G. Chen, Y. Meng, H. Zhu, and X. Zhang, “Binary-level formal verification based automatic security ensurement for plc in industrial iot,” IEEE Transactions on Dependable and Secure Computing, vol. 22, no. 3, pp. 2211–2226, 2024

[1] [1]

Stuxnet: What has changed?

D. E. Denning, “Stuxnet: What has changed?”Future Internet, vol. 4, no. 3, pp. 672–687, 2012

work page 2012

[2] [2]

Backdooring CODESYS applications via vulnerability chaining,

Nozomi Networks Labs, “Backdooring CODESYS applications via vulnerability chaining,” Nozomi Networks Blog, Apr. 2026, accessed: 2026-05-08. [Online]. Available: https://www.nozominetworks.com/ blog/backdooring-codesys-applications-via-vulnerability-chaining

work page 2026

[3] [3]

Bridging the binary analysis gap: A cross-compiler dataset and neural framework for industrial control systems,

Y . G. Achamyeleh, S.-Y . Yu, G. Q. Araya, and M. A. Al Faruque, “Bridging the binary analysis gap: A cross-compiler dataset and neural framework for industrial control systems,” inProceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining V . 2, 2025, pp. 5260–5269

work page 2025

[4] [4]

ICSREF: A framework for automated reverse engineering of industrial control systems binaries,

A. Keliris and M. Maniatakos, “ICSREF: A framework for automated reverse engineering of industrial control systems binaries,” in26th Annual Network and Distributed System Security Symposium, NDSS 2019, San Diego, California, USA, February 24-27, 2019. The Internet Society, 2019

work page 2019

[5] [5]

IEC 61131-3 programming languages,

PLCopen, “IEC 61131-3 programming languages,” Technical documen- tation, 2013. [Online]. Available: https://plcopen.org/iec-61131-3

work page 2013

[6] [6]

Generating application code,

CODESYS GmbH, “Generating application code,” https: //content.helpme-codesys.com/en/CODESYS%20Development% 20System/ cds creating application code for plc.html, accessed: 2026-05-08

work page 2026

[7] [7]

Creating a boot application,

CODESYS, “Creating a boot application,” https://content. helpme-codesys.com/en/CODESYS%20Development%20System/ cds creating a boot application.html, accessed: 2026-05-08

work page 2026

[8] [8]

MATIEC: IEC 61131-3 compiler,

OpenPLC, “MATIEC: IEC 61131-3 compiler,” https://openplcproject. gitlab.io/matiec/, accessed: 2026-05-08

work page 2026

[9] [9]

The Future of Industrial Automation,

AUTONOMY, “The Future of Industrial Automation,” https:// autonomylogic.com/, accessed: 2026-05-11

work page 2026

[10] [10]

Capstone: Next-gen disassembly framework,

N. A. Quynh, “Capstone: Next-gen disassembly framework,”Black Hat USA, vol. 5, no. 2, pp. 3–8, 2014

work page 2014

[11] Hex-Rays, “IDA Pro disassembler and debugger,” Software, 2024. [Online]. Available: https://hex-rays.com/ida-pro/

[12] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, L. Kaiser, and I. Polosukhin, “Attention is all you need,” in Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, December 4-9, 2017, Long Beach, CA, USA, 2017, pp. 5998–6008

[13] W. L. Hamilton, Z. Ying, and J. Leskovec, “Inductive representation learning on large graphs,” in Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, December 4-9, 2017, Long Beach, CA, USA, 2017, pp. 1024–1034

[14] National Security Agency, “Ghidra software reverse engineering framework,” Software, 2019. [Online]. Available: https://ghidra-sre.org/

[15] Y. Shoshitaishvili, R. Wang, C. Salls, N. Stephens, M. Polino, A. Dutcher, J. Grosen, S. Feng, C. Hauser, C. Kruegel et al., “SoK: (State of) the art of war: Offensive techniques in binary analysis,” in 2016 IEEE Symposium on Security and Privacy (SP). IEEE, 2016, pp. 138–157

[16] D. Andriesse, X. Chen, V. Van Der Veen, A. Slowinska, and H. Bos, “An in-depth analysis of disassembly on full-scale x86/x64 binaries,” in 25th USENIX Security Symposium (USENIX Security 16), 2016, pp. 583–600

[17] K. Pei, J. Guan, D. Williams-King, J. Yang, and S. Jana, “XDA: accurate, robust disassembly with transfer learning,” in 28th Annual Network and Distributed System Security Symposium, NDSS 2021, virtually, February 21-25, 2021. The Internet Society, 2021

[18] S. Senthivel, S. Dhungana, H. Yoo, I. Ahmed, and V. Roussev, “Denial of engineering operations attacks in industrial control systems,” in Proceedings of the Eighth ACM Conference on Data and Application Security and Privacy, 2018, pp. 319–329

[19] S. A. Qasim, J. Lopez Jr, and I. Ahmed, “Automated reconstruction of control logic for programmable logic controller forensics,” in International Conference on Information Security. Springer, 2019, pp. 402–422

[20] C. Sang, J. Wu, J. Li, and M. Guizani, “From control application to control logic: PLC decompile framework for industrial control system,” IEEE Transactions on Information Forensics and Security, vol. 19, pp. 8685–8700, 2024

[21] Y. Geng, X. Che, R. Ma, Q. Wei, M. Wang, and Y. Chen, “Control logic attack detection and forensics through reverse-engineering and verifying PLC control applications,” IEEE Internet of Things Journal, vol. 11, no. 5, pp. 8386–8400, 2023

[22] H. Benkraouda, A. Agrawal, D. Tychalas, M. Sazos, and M. Maniatakos, “Towards PLC-specific binary analysis tools: An investigation of CODESYS-compiled PLC software applications,” in Proceedings of the 5th Workshop on CPS&IoT Security and Privacy, 2023, pp. 83–89

[23] I. U. Haq and J. Caballero, “A survey of binary code similarity,” ACM Computing Surveys (CSUR), vol. 54, no. 3, pp. 1–38, 2021

[24] A. Marcelli, M. Graziano, X. Ugarte-Pedrero, Y. Fratantonio, M. Mansouri, and D. Balzarotti, “How machine learning is solving the binary function similarity problem,” in 31st USENIX Security Symposium (USENIX Security 22), 2022, pp. 2099–2116

[25] S. Eschweiler, K. Yakdan, E. Gerhards-Padilla et al., “Discovre: Efficient cross-architecture identification of bugs in binary code,” in NDSS, vol. 52, 2016, pp. 58–79

[26] Q. Feng, R. Zhou, C. Xu, Y. Cheng, B. Testa, and H. Yin, “Scalable graph-based bug search for firmware images,” in Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, 2016, pp. 480–491

[27] X. Xu, C. Liu, Q. Feng, H. Yin, L. Song, and D. Song, “Neural network-based graph embedding for cross-platform binary code similarity detection,” in Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, 2017, pp. 363–376

[28] J. Gao, X. Yang, Y. Fu, Y. Jiang, and J. Sun, “Vulseeker: A semantic learning based vulnerability seeker for cross-platform binary,” in Proceedings of the 33rd ACM/IEEE International Conference on Automated Software Engineering, 2018, pp. 896–899

[29] Z. Yu, R. Cao, Q. Tang, S. Nie, J. Huang, and S. Wu, “Order matters: Semantic-aware neural networks for binary code similarity detection,” in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 34, no. 01, 2020, pp. 1145–1152

[30] Y. Duan, X. Li, J. Wang, and H. Yin, “Deepbindiff: Learning program-wide code representations for binary diffing,” in 27th Annual Network and Distributed System Security Symposium, NDSS 2020, San Diego, California, USA, February 23-26, 2020. The Internet Society, 2020

[31] L. Massarelli, G. A. Di Luna, F. Petroni, R. Baldoni, and L. Querzoni, “Safe: Self-attentive function embeddings for binary similarity,” in International Conference on Detection of Intrusions and Malware, and Vulnerability Assessment. Springer, 2019, pp. 309–329

[32] S. H. Ding, B. C. Fung, and P. Charland, “Asm2vec: Boosting static representation robustness for binary clone search against code obfuscation and compiler optimization,” in 2019 IEEE Symposium on Security and Privacy (SP). IEEE, 2019, pp. 472–489

[33] X. Li, Y. Qu, and H. Yin, “Palmtree: Learning an assembly language model for instruction embedding,” in Proceedings of the 2021 ACM SIGSAC Conference on Computer and Communications Security, 2021, pp. 3236–3251

[34] H. Wang, W. Qu, G. Katz, W. Zhu, Z. Gao, H. Qiu, J. Zhuge, and C. Zhang, “Jtrans: Jump-aware transformer for binary code similarity detection,” in Proceedings of the 31st ACM SIGSOFT International Symposium on Software Testing and Analysis, 2022, pp. 1–13

[35] K. Pei, Z. Xuan, J. Yang, S. Jana, and B. Ray, “Trex: Learning execution semantics from micro-traces for binary similarity,” arXiv preprint arXiv:2012.08680, 2020

[36] A. Jia, M. Fan, W. Jin, X. Xu, Z. Zhou, Q. Tang, S. Nie, S. Wu, and T. Liu, “1-to-1 or 1-to-n? Investigating the effect of function inlining on binary similarity analysis,” ACM Transactions on Software Engineering and Methodology, vol. 32, no. 4, pp. 1–26, 2023

[37] A. Jia, M. Fan, X. Xu, W. Jin, H. Wang, and T. Liu, “Cross-inlining binary function similarity detection,” in Proceedings of the IEEE/ACM 46th International Conference on Software Engineering, 2024, pp. 1–13

[38] D. Tychalas, H. Benkraouda, and M. Maniatakos, “ICSFuzz: Manipulating I/Os and repurposing binary code to enable instrumented fuzzing in ICS control applications,” in 30th USENIX Security Symposium (USENIX Security 21), 2021, pp. 2847–2862

[39] A. Bytes, P. H. N. Rajput, C. Doumanidis, M. Maniatakos, J. Zhou, and N. O. Tippenhauer, “Fieldfuzz: In situ blackbox fuzzing of proprietary industrial automation runtimes via the network,” in Proceedings of the 26th International Symposium on Research in Attacks, Intrusions and Defenses, 2023, pp. 499–512

[40] C. Villa, C. Doumanidis, H. Lamri, P. H. N. Rajput, and M. Maniatakos, “Icsquartz: Scan cycle-aware and vendor-agnostic fuzzing for industrial control systems,” in NDSS, 2025

[41] I. D. Lopez-Miguel, J.-C. Tournier, and B. F. Adiego, “Plcverif: Status of a formal verification tool for programmable logic controller,” arXiv preprint arXiv:2203.17253, 2022

[42] X. Zhang, J. Li, J. Wu, G. Chen, Y. Meng, H. Zhu, and X. Zhang, “Binary-level formal verification based automatic security ensurement for PLC in industrial IoT,” IEEE Transactions on Dependable and Secure Computing, vol. 22, no. 3, pp. 2211–2226, 2024