PLC-BinX: A Cross-Platform Binary Code Analysis Framework for PLC Binaries

Ang Jia; He Jiang; Xiaochen LI; Yaxin Duan; Zhenzhou Tian; Zhilei Ren

arxiv: 2605.17392 · v3 · pith:XP3V23ZZnew · submitted 2026-05-17 · 💻 cs.SE

Pith reviewed 2026-06-30 19:26 UTC · model grok-4.3

classification 💻 cs.SE

keywords PLC binaries · binary code analysis · cross-platform reverse engineering · ICS security · toolchain prediction · functionality prediction · semantic representation

The pith

PLC-BinX applies a three-stage workflow to convert PLC binaries from four platforms into function-level semantic representations.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents PLC-BinX, a framework for binary code analysis of PLC programs that targets three obstacles: heterogeneous binary formats, application logic entangled with runtime code, and weak semantic representations for downstream tasks. The workflow runs cross-platform reverse engineering, then core function identification, then function-level semantic representation. On a collection of 2,431 binaries the method reaches 100% precision, recall, and F1 when identifying the originating toolchain and 49.18% F1 when classifying functionality into 22 categories. A reader would care because PLCs control critical industrial systems and compiled binaries are often the only available artifacts for security checks. The reported results indicate that the extracted representations support concrete prediction tasks that matter for auditing deployed controllers.

Core claim

PLC-BinX applies a three-stage PLC binary analysis workflow, including cross-platform reverse engineering, core function identification, and function-level semantic representation, to analyze PLC binaries from four platforms: CODESYS v3, GEB, OpenPLC v2, and OpenPLC v3. We evaluate PLC-BinX on PLC-BEAD, which contains 2,431 PLC binaries across four platforms, using two downstream tasks: toolchain prediction and functionality prediction. Experimental results show that PLC-BinX achieves 100.00% precision, recall, and F1 in toolchain prediction, and 51.43% precision, 49.38% recall, and 49.18% F1 in functionality prediction over 22 labels.

What carries the argument

The three-stage workflow of cross-platform reverse engineering, core function identification, and function-level semantic representation.

If this is right

Raw PLC binaries from the four tested platforms can be turned into representations that support perfect toolchain identification.
The same representations allow functionality classification at 49.18% F1 across 22 labels.
The approach directly enables downstream security tasks such as deployed-binary auditing for industrial control systems.
The workflow produces usable function-level features without requiring source code or platform-specific manual tuning.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The representations may transfer to other embedded control devices that share similar binary heterogeneity.
Further downstream tasks such as vulnerability detection or malware identification could be tested on the same extracted features.
Accuracy on functionality prediction leaves open the possibility that additional function identification rules would raise the scores.

Load-bearing premise

The three-stage workflow of cross-platform reverse engineering, core function identification, and function-level semantic representation is sufficient to overcome heterogeneous formats, entangled program semantics, and limited semantic representations.

What would settle it

Running the same toolchain and functionality prediction tasks on a fresh collection of PLC binaries from a fifth platform or with a different set of 22 functionality labels would show whether the reported precision, recall, and F1 scores are preserved.
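A true fifth platform is not available in the reported data, but a weaker version of this test can be approximated on the existing four by holding out one whole platform at a time for the functionality task. The sketch below only builds the splits; the `leave_one_platform_out` helper and the `(platform, item)` sample layout are hypothetical names for illustration, and the per-platform counts are those listed in Table I of the paper.

```python
from collections import defaultdict

def leave_one_platform_out(samples):
    """Yield (held_out, train, test), where test is one entire platform.

    `samples` is an iterable of (platform, item) pairs. Both names are
    placeholders for this sketch, not identifiers from the paper.
    """
    by_platform = defaultdict(list)
    for platform, item in samples:
        by_platform[platform].append(item)
    for held_out in sorted(by_platform):
        train = [x for p, xs in by_platform.items() if p != held_out for x in xs]
        test = list(by_platform[held_out])
        yield held_out, train, test

# Per-platform binary counts from Table I; the integer items stand in for binaries.
counts = {"CODESYS v3": 555, "GEB": 617, "OpenPLC v2": 619, "OpenPLC v3": 640}
samples = [(p, i) for p, n in counts.items() for i in range(n)]

splits = list(leave_one_platform_out(samples))
for held_out, train, test in splits:
    print(held_out, len(train), len(test))
```

This split only makes sense for functionality prediction: in toolchain prediction the held-out platform is itself the label, so it would never appear in training.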

Figures

Figures reproduced from arXiv: 2605.17392 by Ang Jia, He Jiang, Xiaochen LI, Yaxin Duan, Zhenzhou Tian, Zhilei Ren.

**Figure 1.** How PLCs connect supervisory components, field devices, and physical processes in a typical ICS. PLCs receive sensor inputs from the field, execute control logic, and output commands to actuators such as motors, valves, pumps, and relays. In a typical deployment, supervisory components such as engineering workstations, Human–Machine Interfaces (HMIs), or SCADA servers configure and monitor the contro…

**Figure 2.** Example of PLC source organization using POUs.

**Figure 3.** Overview of the PLC-BinX workflow.

**Table I.** PLC platforms and binary forms in the dataset.

| Platform | Binary form | # Binaries |
|---|---|---|
| CODESYS v3 [7] | .app container | 555 |
| GEB [3] | ARM ELF executable | 617 |
| OpenPLC v2 [9] | PE executable | 619 |
| OpenPLC v3 [9] | PE executable | 640 |
| Total | – | 2,431 |

A. Input PLC Binaries. We use PLC-BEAD [3] as the dataset for our study. To the best of our knowledge, PLC-BEAD is the only publicly available…

**Figure 4.** Workflow of PLC-BinX for toolchain prediction.

**Figure 5.** Workflow of PLC-BinX for functionality prediction.

Original abstract

As emerging attacks increasingly target Industrial Control Systems (ICS), the security of Programmable Logic Controllers (PLCs) has become a critical concern. Binary Code Analysis (BCA), which enables analysts to analyze compiled programs, is essential for ICS security tasks such as deployed-binary auditing. However, automated BCA for PLC binaries remains challenging due to three key issues: heterogeneous binary formats across PLC platforms, entangled program semantics with runtime code, and limited semantic representations for downstream tasks. To resolve these challenges, we present PLC-BinX, a cross-platform BCA framework for PLC binaries. PLC-BinX applies a three-stage PLC binary analysis workflow, including cross-platform reverse engineering, core function identification, and function-level semantic representation, to analyze PLC binaries from four platforms: CODESYS v3, GEB, OpenPLC v2, and OpenPLC v3. We evaluate PLC-BinX on PLC-BEAD, which contains 2,431 PLC binaries across four platforms, using two downstream tasks: toolchain prediction and functionality prediction. Experimental results show that PLC-BinX achieves 100.00% precision, recall, and F1 in toolchain prediction, and 51.43% precision, 49.38% recall, and 49.18% F1 in functionality prediction over 22 labels. These results demonstrate that PLC-BinX can transform raw PLC binaries into effective function-level semantic representations for PLC binary code analysis.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The 100% toolchain prediction likely comes from format detection in stage one, while the 49% functionality results lack baselines to validate the semantic stages.

The main thing to know is that PLC-BinX reports perfect scores on identifying the toolchain from binaries across four platforms, but this task aligns closely with what the first reverse-engineering stage would do by parsing headers and formats. The functionality prediction task, at 51% precision and 49% F1 over 22 labels, is the more relevant test of semantic representations yet remains modest.

The paper evaluates on PLC-BEAD [3], an existing public dataset with 2,431 binaries from CODESYS v3, GEB, OpenPLC v2, and OpenPLC v3. Its own contribution is a three-stage workflow aimed at the specific problems of heterogeneous PLC formats, entangled code, and weak downstream representations. That focus on a practical ICS security gap is the clearest contribution.

The soft spots are the absence of any baseline comparisons or ablations for the functionality task, plus no breakdown of how much each stage adds. Without those, it is difficult to tell whether the later stages overcome entangled semantics or simply ride on the initial parsing. The abstract supplies performance numbers but little on method details or error patterns.

This work is for researchers in industrial control systems security and binary analysis who need PLC-specific resources. A reader in that area could use the dataset and workflow sketch as a starting point.

It deserves peer review because it supplies new results and testable claims on a real problem. I would send it to referees with requests for baselines and stage-wise analysis.

Referee Report

3 major / 2 minor

Summary. The paper introduces PLC-BinX, a cross-platform binary code analysis framework for PLC binaries that uses a three-stage workflow (cross-platform reverse engineering, core function identification, and function-level semantic representation) to address heterogeneous formats, entangled semantics, and limited representations. It is evaluated on the PLC-BEAD dataset of 2,431 binaries from four platforms (CODESYS v3, GEB, OpenPLC v2, OpenPLC v3) on two tasks: toolchain prediction (reported 100.00% precision/recall/F1) and functionality prediction (51.43% precision, 49.38% recall, 49.18% F1 over 22 labels). The central claim is that the framework transforms raw PLC binaries into effective function-level semantic representations.

Significance. If the results hold under proper validation, the work could contribute to ICS security by enabling binary analysis across PLC platforms, demonstrated on the public PLC-BEAD dataset [3]. The three-stage workflow targets real challenges in the domain. However, the reported metrics do not yet establish that the semantic stages are load-bearing or superior to simpler methods.

major comments (3)

[Abstract, Evaluation] Abstract and Evaluation section: The 100.00% precision/recall/F1 on toolchain prediction does not validate the full workflow or the semantic representation stages. Toolchain prediction is equivalent to platform identification, which can be performed by format detection (headers, magic bytes, or string constants) in the cross-platform reverse engineering stage alone; the core function identification and function-level semantic representation stages are not shown to be necessary for this result.
[Evaluation] Evaluation section: The functionality prediction task reports modest performance (51.43% precision, 49.38% recall, 49.18% F1 over 22 labels) but provides no baseline comparisons, ablations isolating the contribution of each workflow stage, or error analysis. Without these, it is not possible to determine whether the semantic representations are effective or whether the workflow overcomes entangled program semantics.
[Abstract, §3] Abstract and §3 (Workflow): The claim that the three-stage workflow is sufficient to overcome heterogeneous formats, entangled semantics, and limited representations rests on the assumption that core function identification and semantic representation add value beyond the first stage, but no evidence (e.g., stage-wise performance or comparison to format-only baselines) is supplied to support this.
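The format-only baseline requested in these comments could look roughly like the sketch below, which classifies a binary from its leading bytes alone. The ELF and PE/DOS signatures are standard; everything else is an assumption of this sketch: the `format_group`, `fit_majority`, and `predict` helpers are hypothetical, the signature of the CODESYS .app container is not given in the source and is handled by a catch-all, and the sample headers are synthetic.

```python
from collections import Counter

def format_group(header: bytes) -> str:
    """Coarse container format inferred from the leading bytes of a file."""
    if header.startswith(b"\x7fELF"):
        return "elf"    # GEB binaries are ARM ELF executables (Table I)
    if header.startswith(b"MZ"):
        return "pe"     # OpenPLC v2 and v3 are both PE executables (Table I)
    return "other"      # assumption: catch-all, e.g. the CODESYS .app container

def fit_majority(headers, labels):
    """Map each format group to its most frequent training label."""
    votes = {}
    for header, label in zip(headers, labels):
        votes.setdefault(format_group(header), Counter())[label] += 1
    return {group: c.most_common(1)[0][0] for group, c in votes.items()}

def predict(model, header, default="unknown"):
    """Predict the toolchain label from the format group alone."""
    return model.get(format_group(header), default)

# Synthetic headers for illustration only; these are not PLC-BEAD samples.
train_headers = [b"\x7fELF\x01\x01", b"MZ\x90\x00", b"MZ\x90\x00",
                 b"MZ\x90\x00", b"\x00\x00\x00\x01"]
train_labels = ["GEB", "OpenPLC v2", "OpenPLC v3", "OpenPLC v3", "CODESYS v3"]
model = fit_majority(train_headers, train_labels)
print(model)
```

One caveat follows directly from Table I: OpenPLC v2 and OpenPLC v3 are both listed as PE executables, so magic bytes by themselves collapse them into one group. A format-only baseline that aims to match the reported 100% would need further cues, such as the string constants mentioned in the first comment.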

minor comments (2)

[Abstract] The abstract states concrete performance numbers but does not describe dataset construction criteria, label definitions for the 22 functionality classes, or the exact method used to obtain the reported metrics.
[Evaluation] Notation for the semantic representation vectors and the downstream classifiers is not introduced in the provided abstract; this should be clarified in the main text for reproducibility.
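On the question of how the reported metrics were obtained: the abstract does not state the averaging scheme, but the numbers carry a hint. A harmonic mean always lies between its two inputs, so the reported F1 (49.18%) cannot be the harmonic mean of the reported precision (51.43%) and recall (49.38%); that would give about 50.38%. The figures are instead consistent with F1 being computed per label and then averaged, although they cannot distinguish macro from weighted averaging. The sketch below shows macro averaging on toy labels; `macro_prf` and `harmonic` are hypothetical helper names.

```python
def macro_prf(y_true, y_pred, labels):
    """Per-label precision, recall, and F1, then an unweighted mean over labels."""
    ps, rs, fs = [], [], []
    for lab in labels:
        tp = sum(t == lab and p == lab for t, p in zip(y_true, y_pred))
        fp = sum(t != lab and p == lab for t, p in zip(y_true, y_pred))
        fn = sum(t == lab and p != lab for t, p in zip(y_true, y_pred))
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        ps.append(prec)
        rs.append(rec)
        fs.append(f1)
    n = len(labels)
    return sum(ps) / n, sum(rs) / n, sum(fs) / n

def harmonic(p, r):
    """Harmonic mean of two aggregate scores."""
    return 2 * p * r / (p + r) if p + r else 0.0

# Toy labels for illustration only.
y_true = ["a", "a", "b", "b", "c", "c"]
y_pred = ["a", "b", "b", "b", "c", "a"]
p, r, f = macro_prf(y_true, y_pred, ["a", "b", "c"])
print(round(p, 4), round(r, 4), round(f, 4))   # macro F1 falls below both P and R
print(round(harmonic(51.43, 49.38), 2))        # harmonic mean of the reported P and R
```

In the toy case the macro F1 is lower than both macro precision and macro recall, the same ordering seen in the reported functionality results.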

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive feedback on our manuscript. We address each major comment point-by-point below, agreeing with the need for clarification and additional analysis where the current evaluation falls short, and outlining the revisions we will make.

Referee: [Abstract, Evaluation] Abstract and Evaluation section: The 100.00% precision/recall/F1 on toolchain prediction does not validate the full workflow or the semantic representation stages. Toolchain prediction is equivalent to platform identification, which can be performed by format detection (headers, magic bytes, or string constants) in the cross-platform reverse engineering stage alone; the core function identification and function-level semantic representation stages are not shown to be necessary for this result.

Authors: We agree that the 100% toolchain prediction result can be achieved via format detection in the first stage alone and does not demonstrate the necessity of the later stages. This metric primarily validates the cross-platform reverse engineering component of the workflow. The full three-stage approach is motivated by the functionality prediction task. We will revise the abstract and evaluation section to explicitly distinguish the role of each stage and clarify that toolchain prediction validates the initial reverse engineering process rather than the semantic stages. revision: yes
Referee: [Evaluation] Evaluation section: The functionality prediction task reports modest performance (51.43% precision, 49.38% recall, 49.18% F1 over 22 labels) but provides no baseline comparisons, ablations isolating the contribution of each workflow stage, or error analysis. Without these, it is not possible to determine whether the semantic representations are effective or whether the workflow overcomes entangled program semantics.

Authors: We acknowledge that the evaluation lacks baseline comparisons, ablations, and error analysis, making it difficult to isolate the contribution of the core function identification and semantic representation stages. In the revised manuscript, we will add baseline experiments (e.g., format-only or first-stage-only approaches), ablation studies removing individual stages, and error analysis across the 22 labels to better demonstrate whether the semantic stages address entangled semantics. revision: yes
Referee: [Abstract, §3] Abstract and §3 (Workflow): The claim that the three-stage workflow is sufficient to overcome heterogeneous formats, entangled semantics, and limited representations rests on the assumption that core function identification and semantic representation add value beyond the first stage, but no evidence (e.g., stage-wise performance or comparison to format-only baselines) is supplied to support this.

Authors: We recognize that the manuscript does not currently provide stage-wise performance or format-only baseline comparisons to support the added value of the later stages. We will incorporate these analyses into the evaluation section of the revised manuscript, including stage-wise metrics and direct comparisons to format-only methods, to substantiate the claim that the full workflow is required for effective function-level semantic representations. revision: yes

Circularity Check

0 steps flagged

No circularity; empirical results on external dataset

The paper describes a three-stage framework evaluated via direct experiments on the PLC-BEAD dataset (2,431 binaries) for toolchain and functionality prediction tasks. No equations, fitted parameters, self-definitional constructs, or load-bearing self-citations are present in the provided text. Claims reduce to reported precision/recall/F1 metrics rather than any derivation that collapses to inputs by construction. This matches the default case of a self-contained empirical paper against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

Abstract-only review; no explicit free parameters, invented entities, or detailed axioms are stated. The central claim rests on the unverified assumption that the chosen platforms and dataset adequately represent real-world PLC binary heterogeneity.

axioms (1)

domain assumption: The four listed platforms and the PLC-BEAD collection sufficiently capture the heterogeneity, entanglement, and representation challenges described.
Invoked by the choice of evaluation targets and the claim that results generalize to PLC binary analysis.

Reference graph

Works this paper leans on

42 extracted references · 2 canonical work pages

[1]

Stuxnet: What has changed?

D. E. Denning, “Stuxnet: What has changed?” Future Internet, vol. 4, no. 3, pp. 672–687, 2012

2012
[2]

Backdooring CODESYS applications via vulnerability chaining,

Nozomi Networks Labs, “Backdooring CODESYS applications via vulnerability chaining,” Nozomi Networks Blog, Apr. 2026, accessed: 2026-05-08. [Online]. Available: https://www.nozominetworks.com/blog/backdooring-codesys-applications-via-vulnerability-chaining

2026
[3]

Bridging the binary analysis gap: A cross-compiler dataset and neural framework for industrial control systems,

Y. G. Achamyeleh, S.-Y. Yu, G. Q. Araya, and M. A. Al Faruque, “Bridging the binary analysis gap: A cross-compiler dataset and neural framework for industrial control systems,” in Proceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining V.2, 2025, pp. 5260–5269

2025
[4]

ICSREF: A framework for automated reverse engineering of industrial control systems binaries,

A. Keliris and M. Maniatakos, “ICSREF: A framework for automated reverse engineering of industrial control systems binaries,” in 26th Annual Network and Distributed System Security Symposium, NDSS 2019, San Diego, California, USA, February 24-27, 2019. The Internet Society, 2019

2019
[5]

IEC 61131-3 programming languages,

PLCopen, “IEC 61131-3 programming languages,” Technical documentation, 2013. [Online]. Available: https://plcopen.org/iec-61131-3

2013
[6]

Generating application code,

CODESYS GmbH, “Generating application code,” https://content.helpme-codesys.com/en/CODESYS%20Development%20System/ cds creating application code for plc.html, accessed: 2026-05-08

2026
[7]

Creating a boot application,

CODESYS, “Creating a boot application,” https://content.helpme-codesys.com/en/CODESYS%20Development%20System/ cds creating a boot application.html, accessed: 2026-05-08

2026
[8]

MATIEC: IEC 61131-3 compiler,

OpenPLC, “MATIEC: IEC 61131-3 compiler,” https://openplcproject.gitlab.io/matiec/, accessed: 2026-05-08

2026
[9]

The Future of Industrial Automation,

AUTONOMY, “The Future of Industrial Automation,” https://autonomylogic.com/, accessed: 2026-05-11

2026
[10]

Capstone: Next-gen disassembly framework,

N. A. Quynh, “Capstone: Next-gen disassembly framework,” Black Hat USA, vol. 5, no. 2, pp. 3–8, 2014

2014
[11]

IDA Pro disassembler and debugger,

Hex-Rays, “IDA Pro disassembler and debugger,” Software, 2024. [Online]. Available: https://hex-rays.com/ida-pro/

2024
[12]

Attention is all you need,

A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, L. Kaiser, and I. Polosukhin, “Attention is all you need,” in Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, December 4-9, 2017, Long Beach, CA, USA, 2017, pp. 5998–6008

2017
[13]

Inductive representation learning on large graphs,

W. L. Hamilton, Z. Ying, and J. Leskovec, “Inductive representation learning on large graphs,” in Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, December 4-9, 2017, Long Beach, CA, USA, 2017, pp. 1024–1034

2017
[14]

Ghidra software reverse engineering framework,

National Security Agency, “Ghidra software reverse engineering framework,” Software, 2019. [Online]. Available: https://ghidra-sre.org/

2019
[15]

SoK: (State of) the art of war: Offensive techniques in binary analysis,

Y. Shoshitaishvili, R. Wang, C. Salls, N. Stephens, M. Polino, A. Dutcher, J. Grosen, S. Feng, C. Hauser, C. Kruegel et al., “SoK: (State of) the art of war: Offensive techniques in binary analysis,” in 2016 IEEE Symposium on Security and Privacy (SP). IEEE, 2016, pp. 138–157

2016
[16]

An in-depth analysis of disassembly on full-scale x86/x64 binaries,

D. Andriesse, X. Chen, V. Van Der Veen, A. Slowinska, and H. Bos, “An in-depth analysis of disassembly on full-scale x86/x64 binaries,” in 25th USENIX Security Symposium (USENIX Security 16), 2016, pp. 583–600

2016
[17]

XDA: accurate, robust disassembly with transfer learning,

K. Pei, J. Guan, D. Williams-King, J. Yang, and S. Jana, “XDA: accurate, robust disassembly with transfer learning,” in 28th Annual Network and Distributed System Security Symposium, NDSS 2021, virtually, February 21-25, 2021. The Internet Society, 2021

2021
[18]

Denial of engineering operations attacks in industrial control systems,

S. Senthivel, S. Dhungana, H. Yoo, I. Ahmed, and V. Roussev, “Denial of engineering operations attacks in industrial control systems,” in Proceedings of the Eighth ACM Conference on Data and Application Security and Privacy, 2018, pp. 319–329

2018
[19]

Automated reconstruction of control logic for programmable logic controller forensics,

S. A. Qasim, J. Lopez Jr, and I. Ahmed, “Automated reconstruction of control logic for programmable logic controller forensics,” in International Conference on Information Security. Springer, 2019, pp. 402–422

2019
[20]

From control application to control logic: Plc decompile framework for industrial control system,

C. Sang, J. Wu, J. Li, and M. Guizani, “From control application to control logic: Plc decompile framework for industrial control system,” IEEE Transactions on Information Forensics and Security, vol. 19, pp. 8685–8700, 2024

2024
[21]

Control logic attack detection and forensics through reverse-engineering and verifying plc control applications,

Y. Geng, X. Che, R. Ma, Q. Wei, M. Wang, and Y. Chen, “Control logic attack detection and forensics through reverse-engineering and verifying plc control applications,” IEEE Internet of Things Journal, vol. 11, no. 5, pp. 8386–8400, 2023

2023
[22]

Towards plc-specific binary analysis tools: An investigation of codesys-compiled plc software applications,

H. Benkraouda, A. Agrawal, D. Tychalas, M. Sazos, and M. Maniatakos, “Towards plc-specific binary analysis tools: An investigation of codesys-compiled plc software applications,” in Proceedings of the 5th Workshop on CPS&IoT Security and Privacy, 2023, pp. 83–89

2023
[23]

A survey of binary code similarity,

I. U. Haq and J. Caballero, “A survey of binary code similarity,” ACM Computing Surveys (CSUR), vol. 54, no. 3, pp. 1–38, 2021

2021
[24]

How machine learning is solving the binary function similarity problem,

A. Marcelli, M. Graziano, X. Ugarte-Pedrero, Y. Fratantonio, M. Mansouri, and D. Balzarotti, “How machine learning is solving the binary function similarity problem,” in 31st USENIX Security Symposium (USENIX Security 22), 2022, pp. 2099–2116

2022
[25]

Discovre: Efficient cross-architecture identification of bugs in binary code

S. Eschweiler, K. Yakdan, E. Gerhards-Padilla et al., “Discovre: Efficient cross-architecture identification of bugs in binary code,” in NDSS, vol. 52, 2016, pp. 58–79

2016
[26]

Scalable graph-based bug search for firmware images,

Q. Feng, R. Zhou, C. Xu, Y. Cheng, B. Testa, and H. Yin, “Scalable graph-based bug search for firmware images,” in Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, 2016, pp. 480–491

2016
[27]

Neural network-based graph embedding for cross-platform binary code similarity detection,

X. Xu, C. Liu, Q. Feng, H. Yin, L. Song, and D. Song, “Neural network-based graph embedding for cross-platform binary code similarity detection,” in Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, 2017, pp. 363–376

2017
[28]

Vulseeker: A semantic learning based vulnerability seeker for cross-platform binary,

J. Gao, X. Yang, Y. Fu, Y. Jiang, and J. Sun, “Vulseeker: A semantic learning based vulnerability seeker for cross-platform binary,” in Proceedings of the 33rd ACM/IEEE International Conference on Automated Software Engineering, 2018, pp. 896–899

2018
[29]

Order matters: Semantic-aware neural networks for binary code similarity detection,

Z. Yu, R. Cao, Q. Tang, S. Nie, J. Huang, and S. Wu, “Order matters: Semantic-aware neural networks for binary code similarity detection,” in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 34, no. 01, 2020, pp. 1145–1152

2020
[30]

Deepbindiff: Learning program-wide code representations for binary diffing,

Y. Duan, X. Li, J. Wang, and H. Yin, “Deepbindiff: Learning program-wide code representations for binary diffing,” in 27th Annual Network and Distributed System Security Symposium, NDSS 2020, San Diego, California, USA, February 23-26, 2020. The Internet Society, 2020

2020
[31]

Safe: Self-attentive function embeddings for binary similarity,

L. Massarelli, G. A. Di Luna, F. Petroni, R. Baldoni, and L. Querzoni, “Safe: Self-attentive function embeddings for binary similarity,” in International Conference on Detection of Intrusions and Malware, and Vulnerability Assessment. Springer, 2019, pp. 309–329

2019
[32]

Asm2vec: Boosting static representation robustness for binary clone search against code obfuscation and compiler optimization,

S. H. Ding, B. C. Fung, and P. Charland, “Asm2vec: Boosting static representation robustness for binary clone search against code obfuscation and compiler optimization,” in 2019 IEEE Symposium on Security and Privacy (SP). IEEE, 2019, pp. 472–489

2019
[33]

Palmtree: Learning an assembly language model for instruction embedding,

X. Li, Y. Qu, and H. Yin, “Palmtree: Learning an assembly language model for instruction embedding,” in Proceedings of the 2021 ACM SIGSAC Conference on Computer and Communications Security, 2021, pp. 3236–3251

2021
[34]

Jtrans: Jump-aware transformer for binary code similarity detection,

H. Wang, W. Qu, G. Katz, W. Zhu, Z. Gao, H. Qiu, J. Zhuge, and C. Zhang, “Jtrans: Jump-aware transformer for binary code similarity detection,” in Proceedings of the 31st ACM SIGSOFT International Symposium on Software Testing and Analysis, 2022, pp. 1–13

2022
[35]

Trex: Learning execution semantics from micro-traces for binary similarity,

K. Pei, Z. Xuan, J. Yang, S. Jana, and B. Ray, “Trex: Learning execution semantics from micro-traces for binary similarity,” arXiv preprint arXiv:2012.08680, 2020

2020
[36]

1-to-1 or 1-to-n? investigating the effect of function inlining on binary similarity analysis,

A. Jia, M. Fan, W. Jin, X. Xu, Z. Zhou, Q. Tang, S. Nie, S. Wu, and T. Liu, “1-to-1 or 1-to-n? investigating the effect of function inlining on binary similarity analysis,” ACM Transactions on Software Engineering and Methodology, vol. 32, no. 4, pp. 1–26, 2023

2023
[37]

Cross-inlining binary function similarity detection,

A. Jia, M. Fan, X. Xu, W. Jin, H. Wang, and T. Liu, “Cross-inlining binary function similarity detection,” in Proceedings of the IEEE/ACM 46th International Conference on Software Engineering, 2024, pp. 1–13

2024
[38]

ICSFuzz: Manipulating I/Os and repurposing binary code to enable instrumented fuzzing in ICS control applications,

D. Tychalas, H. Benkraouda, and M. Maniatakos, “ICSFuzz: Manipulating I/Os and repurposing binary code to enable instrumented fuzzing in ICS control applications,” in 30th USENIX Security Symposium (USENIX Security 21), 2021, pp. 2847–2862

2021
[39]

Fieldfuzz: In situ blackbox fuzzing of proprietary industrial automation runtimes via the network,

A. Bytes, P. H. N. Rajput, C. Doumanidis, M. Maniatakos, J. Zhou, and N. O. Tippenhauer, “Fieldfuzz: In situ blackbox fuzzing of proprietary industrial automation runtimes via the network,” in Proceedings of the 26th International Symposium on Research in Attacks, Intrusions and Defenses, 2023, pp. 499–512

2023
[40]

Icsquartz: Scan cycle-aware and vendor-agnostic fuzzing for industrial control systems

C. Villa, C. Doumanidis, H. Lamri, P. H. N. Rajput, and M. Maniatakos, “Icsquartz: Scan cycle-aware and vendor-agnostic fuzzing for industrial control systems,” in NDSS, 2025

2025
[41]

Plcverif: Status of a formal verification tool for programmable logic controller,

I. D. Lopez-Miguel, J.-C. Tournier, and B. F. Adiego, “Plcverif: Status of a formal verification tool for programmable logic controller,” arXiv preprint arXiv:2203.17253, 2022

2022
[42]

Binary-level formal verification based automatic security ensurement for plc in industrial iot,

X. Zhang, J. Li, J. Wu, G. Chen, Y. Meng, H. Zhu, and X. Zhang, “Binary-level formal verification based automatic security ensurement for plc in industrial iot,” IEEE Transactions on Dependable and Secure Computing, vol. 22, no. 3, pp. 2211–2226, 2024

2024

[1] [1]

Stuxnet: What has changed?

D. E. Denning, “Stuxnet: What has changed?”Future Internet, vol. 4, no. 3, pp. 672–687, 2012

2012

[2] [2]

Backdooring CODESYS applications via vulnerability chaining,

Nozomi Networks Labs, “Backdooring CODESYS applications via vulnerability chaining,” Nozomi Networks Blog, Apr. 2026, accessed: 2026-05-08. [Online]. Available: https://www.nozominetworks.com/ blog/backdooring-codesys-applications-via-vulnerability-chaining

2026

[3] [3]

Bridging the binary analysis gap: A cross-compiler dataset and neural framework for industrial control systems,

Y . G. Achamyeleh, S.-Y . Yu, G. Q. Araya, and M. A. Al Faruque, “Bridging the binary analysis gap: A cross-compiler dataset and neural framework for industrial control systems,” inProceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining V . 2, 2025, pp. 5260–5269

2025

[4] [4]

ICSREF: A framework for automated reverse engineering of industrial control systems binaries,

A. Keliris and M. Maniatakos, “ICSREF: A framework for automated reverse engineering of industrial control systems binaries,” in26th Annual Network and Distributed System Security Symposium, NDSS 2019, San Diego, California, USA, February 24-27, 2019. The Internet Society, 2019

2019

[5] [5]

IEC 61131-3 programming languages,

PLCopen, “IEC 61131-3 programming languages,” Technical documen- tation, 2013. [Online]. Available: https://plcopen.org/iec-61131-3

2013

[6] [6]

Generating application code,

CODESYS GmbH, “Generating application code,” https: //content.helpme-codesys.com/en/CODESYS%20Development% 20System/ cds creating application code for plc.html, accessed: 2026-05-08

2026

[7] [7]

Creating a boot application,

CODESYS, “Creating a boot application,” https://content. helpme-codesys.com/en/CODESYS%20Development%20System/ cds creating a boot application.html, accessed: 2026-05-08

2026

[8] [8]

MATIEC: IEC 61131-3 compiler,

OpenPLC, “MATIEC: IEC 61131-3 compiler,” https://openplcproject. gitlab.io/matiec/, accessed: 2026-05-08

2026

[9] [9]

The Future of Industrial Automation,

AUTONOMY, “The Future of Industrial Automation,” https:// autonomylogic.com/, accessed: 2026-05-11

2026

[10] [10]

Capstone: Next-gen disassembly framework,

N. A. Quynh, “Capstone: Next-gen disassembly framework,”Black Hat USA, vol. 5, no. 2, pp. 3–8, 2014

2014

[11] [11]

IDA Pro disassembler and debugger,

Hex-Rays, “IDA Pro disassembler and debugger,” Software, 2024. [Online]. Available: https://hex-rays.com/ida-pro/ 14

2024

[12] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, L. Kaiser, and I. Polosukhin, “Attention is all you need,” in Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, December 4-9, 2017, Long Beach, CA, USA, 2017, pp. 5998–6008

[13] W. L. Hamilton, Z. Ying, and J. Leskovec, “Inductive representation learning on large graphs,” in Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, December 4-9, 2017, Long Beach, CA, USA, 2017, pp. 1024–1034

[14] National Security Agency, “Ghidra software reverse engineering framework,” Software, 2019. [Online]. Available: https://ghidra-sre.org/

[15] Y. Shoshitaishvili, R. Wang, C. Salls, N. Stephens, M. Polino, A. Dutcher, J. Grosen, S. Feng, C. Hauser, C. Kruegel et al., “Sok: (state of) the art of war: Offensive techniques in binary analysis,” in 2016 IEEE symposium on security and privacy (SP). IEEE, 2016, pp. 138–157

[16] D. Andriesse, X. Chen, V. Van Der Veen, A. Slowinska, and H. Bos, “An In-Depth analysis of disassembly on Full-Scale x86/x64 binaries,” in 25th USENIX security symposium (USENIX security 16), 2016, pp. 583–600

[17] K. Pei, J. Guan, D. Williams-King, J. Yang, and S. Jana, “XDA: accurate, robust disassembly with transfer learning,” in 28th Annual Network and Distributed System Security Symposium, NDSS 2021, virtually, February 21-25, 2021. The Internet Society, 2021

[18] S. Senthivel, S. Dhungana, H. Yoo, I. Ahmed, and V. Roussev, “Denial of engineering operations attacks in industrial control systems,” in Proceedings of the Eighth ACM Conference on Data and Application Security and Privacy, 2018, pp. 319–329

[19] S. A. Qasim, J. Lopez Jr, and I. Ahmed, “Automated reconstruction of control logic for programmable logic controller forensics,” in International Conference on Information Security. Springer, 2019, pp. 402–422

[20] C. Sang, J. Wu, J. Li, and M. Guizani, “From control application to control logic: Plc decompile framework for industrial control system,” IEEE Transactions on Information Forensics and Security, vol. 19, pp. 8685–8700, 2024

[21] Y. Geng, X. Che, R. Ma, Q. Wei, M. Wang, and Y. Chen, “Control logic attack detection and forensics through reverse-engineering and verifying plc control applications,” IEEE Internet of Things Journal, vol. 11, no. 5, pp. 8386–8400, 2023

[22] H. Benkraouda, A. Agrawal, D. Tychalas, M. Sazos, and M. Maniatakos, “Towards plc-specific binary analysis tools: An investigation of codesys-compiled plc software applications,” in Proceedings of the 5th Workshop on CPS&IoT Security and Privacy, 2023, pp. 83–89

[23] I. U. Haq and J. Caballero, “A survey of binary code similarity,” ACM Computing Surveys (CSUR), vol. 54, no. 3, pp. 1–38, 2021

[24] A. Marcelli, M. Graziano, X. Ugarte-Pedrero, Y. Fratantonio, M. Mansouri, and D. Balzarotti, “How machine learning is solving the binary function similarity problem,” in 31st USENIX Security Symposium (USENIX Security 22), 2022, pp. 2099–2116

[25] S. Eschweiler, K. Yakdan, E. Gerhards-Padilla et al., “Discovre: Efficient cross-architecture identification of bugs in binary code,” in NDSS, vol. 52, 2016, pp. 58–79

[26] Q. Feng, R. Zhou, C. Xu, Y. Cheng, B. Testa, and H. Yin, “Scalable graph-based bug search for firmware images,” in Proceedings of the 2016 ACM SIGSAC conference on computer and communications security, 2016, pp. 480–491

[27] X. Xu, C. Liu, Q. Feng, H. Yin, L. Song, and D. Song, “Neural network-based graph embedding for cross-platform binary code similarity detection,” in Proceedings of the 2017 ACM SIGSAC conference on computer and communications security, 2017, pp. 363–376

[28] J. Gao, X. Yang, Y. Fu, Y. Jiang, and J. Sun, “Vulseeker: A semantic learning based vulnerability seeker for cross-platform binary,” in Proceedings of the 33rd ACM/IEEE International Conference on Automated Software Engineering, 2018, pp. 896–899

[29] Z. Yu, R. Cao, Q. Tang, S. Nie, J. Huang, and S. Wu, “Order matters: Semantic-aware neural networks for binary code similarity detection,” in Proceedings of the AAAI conference on artificial intelligence, vol. 34, no. 01, 2020, pp. 1145–1152

[30] Y. Duan, X. Li, J. Wang, and H. Yin, “Deepbindiff: Learning program-wide code representations for binary diffing,” in 27th Annual Network and Distributed System Security Symposium, NDSS 2020, San Diego, California, USA, February 23-26, 2020. The Internet Society, 2020

[31] L. Massarelli, G. A. Di Luna, F. Petroni, R. Baldoni, and L. Querzoni, “Safe: Self-attentive function embeddings for binary similarity,” in International Conference on Detection of Intrusions and Malware, and Vulnerability Assessment. Springer, 2019, pp. 309–329

[32] S. H. Ding, B. C. Fung, and P. Charland, “Asm2vec: Boosting static representation robustness for binary clone search against code obfuscation and compiler optimization,” in 2019 IEEE symposium on security and privacy (SP). IEEE, 2019, pp. 472–489

[33] X. Li, Y. Qu, and H. Yin, “Palmtree: Learning an assembly language model for instruction embedding,” in Proceedings of the 2021 ACM SIGSAC conference on computer and communications security, 2021, pp. 3236–3251

[34] H. Wang, W. Qu, G. Katz, W. Zhu, Z. Gao, H. Qiu, J. Zhuge, and C. Zhang, “Jtrans: Jump-aware transformer for binary code similarity detection,” in Proceedings of the 31st ACM SIGSOFT international symposium on software testing and analysis, 2022, pp. 1–13

[35] K. Pei, Z. Xuan, J. Yang, S. Jana, and B. Ray, “Trex: Learning execution semantics from micro-traces for binary similarity,” arXiv preprint arXiv:2012.08680, 2020

[36] A. Jia, M. Fan, W. Jin, X. Xu, Z. Zhou, Q. Tang, S. Nie, S. Wu, and T. Liu, “1-to-1 or 1-to-n? investigating the effect of function inlining on binary similarity analysis,” ACM Transactions on Software Engineering and Methodology, vol. 32, no. 4, pp. 1–26, 2023

[37] A. Jia, M. Fan, X. Xu, W. Jin, H. Wang, and T. Liu, “Cross-inlining binary function similarity detection,” in Proceedings of the IEEE/ACM 46th international conference on software engineering, 2024, pp. 1–13

[38] D. Tychalas, H. Benkraouda, and M. Maniatakos, “ICSFuzz: Manipulating I/Os and repurposing binary code to enable instrumented fuzzing in ICS control applications,” in 30th USENIX Security Symposium (USENIX Security 21), 2021, pp. 2847–2862

[39] A. Bytes, P. H. N. Rajput, C. Doumanidis, M. Maniatakos, J. Zhou, and N. O. Tippenhauer, “Fieldfuzz: In situ blackbox fuzzing of proprietary industrial automation runtimes via the network,” in Proceedings of the 26th International Symposium on Research in Attacks, Intrusions and Defenses, 2023, pp. 499–512

[40] C. Villa, C. Doumanidis, H. Lamri, P. H. N. Rajput, and M. Maniatakos, “Icsquartz: Scan cycle-aware and vendor-agnostic fuzzing for industrial control systems,” in NDSS, 2025

[41] I. D. Lopez-Miguel, J.-C. Tournier, and B. F. Adiego, “Plcverif: Status of a formal verification tool for programmable logic controller,” arXiv preprint arXiv:2203.17253, 2022

[42] X. Zhang, J. Li, J. Wu, G. Chen, Y. Meng, H. Zhu, and X. Zhang, “Binary-level formal verification based automatic security ensurement for plc in industrial iot,” IEEE Transactions on Dependable and Secure Computing, vol. 22, no. 3, pp. 2211–2226, 2024