LLM-assisted Generation of Pseudo-C2 Servers for IoT Malware Dynamic Analysis

K. Hasui; M. Hashimoto; M. Shimamura; S. Matsugaya

arxiv: 2606.21349 · v2 · pith:4IW57QSUnew · submitted 2026-06-19 · 💻 cs.CR

Pith reviewed 2026-06-26 14:03 UTC · model grok-4.3

classification 💻 cs.CR

keywords IoT malware · command and control · dynamic analysis · LLM · protocol extraction · Mirai · pseudo-C2 server · botnet analysis

The pith

An LLM plus decompiler extracts full C2 protocols from malware binaries to build working pseudo servers.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper reports that Ghidra decompilation combined with an LLM can read the control logic inside an IoT malware binary and recover the rules it uses to communicate with its command server. Those rules are then turned into a standalone fake server that talks to the malware sample and triggers its attack code. Experiments on Mirai show the extraction matches all 20 ground-truth protocol elements and the generated server fully reproduces seven of ten DDoS attack vectors. The same pipeline succeeds on a deliberately altered version of Mirai, which the authors take as evidence that the LLM is reading the binary structures themselves rather than recalling known code.

Core claim

The system extracts all 20 core protocol elements from the Mirai binary in 100 percent agreement with the ground truth and produces a pseudo-C2 server that fully reproduces seven of ten DDoS attack vectors with matching behavior; the identical end-to-end process succeeds on a source-modified Mirai variant, showing the LLM infers specifications from binary structures without pre-trained knowledge of the malware.
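The two headline figures in this claim are simple ratios. The sketch below shows one way such scores could be computed; it is an illustration only, since the page does not describe the paper's scoring procedure. The element names, vector names, and values are hypothetical placeholders, and exact-match comparison is an assumption.

```python
def extraction_accuracy(extracted: dict, ground_truth: dict) -> float:
    """Fraction of ground-truth elements whose extracted value matches exactly."""
    if not ground_truth:
        raise ValueError("ground truth must contain at least one element")
    matched = sum(
        1
        for name, expected in ground_truth.items()
        if name in extracted and extracted[name] == expected
    )
    return matched / len(ground_truth)


def reproduction_rate(reproduced: dict) -> float:
    """Fraction of attack vectors marked as fully reproduced (True)."""
    if not reproduced:
        raise ValueError("at least one attack vector is required")
    return sum(1 for ok in reproduced.values() if ok) / len(reproduced)


# Hypothetical placeholder data: 20 protocol elements, 10 attack vectors.
ground_truth = {f"element_{i:02d}": f"value_{i:02d}" for i in range(1, 21)}
extracted = dict(ground_truth)  # every element recovered correctly
vectors = {f"vector_{i}": i < 7 for i in range(10)}  # 7 of 10 reproduced

print(f"extraction accuracy: {extraction_accuracy(extracted, ground_truth):.0%}")
print(f"reproduction rate:   {reproduction_rate(vectors):.0%}")
```

With a single binary and a fixed list of 20 elements, a score of 100 percent means 20 exact matches, so how the ground-truth list was built matters as much as the score itself.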

What carries the argument

LLM semantic interpretation of decompiled binary control structures to recover protocol elements, followed by automated generation of a pseudo-C2 server that implements those elements.
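To make "a server that implements those elements" concrete, the sketch below treats an extracted specification as plain data and derives a message encoder and decoder from it. It is a minimal illustration under stated assumptions, not the paper's generator: the `FieldSpec` and `MessageSpec` types, the field names, and the length-prefixed layout are all hypothetical and do not describe any real malware's wire format.

```python
import struct
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldSpec:
    name: str  # hypothetical field name
    fmt: str   # struct format code, e.g. "B" (1 byte), "H" (2 bytes), "I" (4 bytes)


@dataclass(frozen=True)
class MessageSpec:
    fields: tuple                  # ordered FieldSpec entries
    byte_order: str = "!"          # network byte order, no padding
    length_prefix_fmt: str = "H"   # assumed 2-byte length prefix


def _body_format(spec: MessageSpec) -> str:
    return spec.byte_order + "".join(f.fmt for f in spec.fields)


def encode(spec: MessageSpec, values: dict) -> bytes:
    """Pack the field values in spec order and prepend the body length."""
    body = struct.pack(_body_format(spec), *(values[f.name] for f in spec.fields))
    prefix = struct.pack(spec.byte_order + spec.length_prefix_fmt, len(body))
    return prefix + body


def decode(spec: MessageSpec, frame: bytes) -> dict:
    """Inverse of encode: read the length prefix, then unpack the body."""
    prefix_fmt = spec.byte_order + spec.length_prefix_fmt
    prefix_size = struct.calcsize(prefix_fmt)
    (length,) = struct.unpack(prefix_fmt, frame[:prefix_size])
    body = frame[prefix_size:prefix_size + length]
    unpacked = struct.unpack(_body_format(spec), body)
    return dict(zip((f.name for f in spec.fields), unpacked))


# Hypothetical specification, standing in for what an extraction step might emit.
SPEC = MessageSpec(fields=(
    FieldSpec("opcode", "B"),
    FieldSpec("sequence", "H"),
    FieldSpec("argument", "I"),
))

frame = encode(SPEC, {"opcode": 2, "sequence": 1, "argument": 42})
print(frame.hex())
print(decode(SPEC, frame))
```

Keeping the specification as data rather than hand-written code means that an error in an extracted element shows up as a single wrong entry, which is easier to audit against a ground truth than generated server logic.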

If this is right

Dormant malware samples without live C2 infrastructure become amenable to full dynamic analysis.
Protocol extraction reaches complete coverage of the 20 core elements on the tested family.
A majority of attack vectors can be reproduced consistently by the generated server.
The method operates on source-altered variants, confirming reliance on observable binary structures.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same extraction-plus-generation steps could be tried on other common IoT botnet families to test breadth.
Pseudo servers built this way could serve as controlled environments for testing new detection signatures.
If the accuracy holds across families, large-scale automated analysis of collected samples becomes feasible without waiting for real C2 hosts.

Load-bearing premise

The LLM can correctly map the meaning of binary control structures onto protocol commands even for customized malware variants whose source differs from any training data.

What would settle it

Apply the pipeline to a new customized variant and observe whether the generated pseudo-C2 elicits the malware's expected attack commands or produces mismatched protocol behavior.

Figures

Figures reproduced from arXiv: 2606.21349 by K. Hasui, M. Hashimoto, M. Shimamura, S. Matsugaya.

Original abstract

Most IoT malware operates as botnets dependent on Command and Control (C2) servers, but the short-lived nature of attack infrastructure often leaves samples dormant without C2 communication, hindering dynamic analysis. This paper proposes a system that combines Ghidra with a Large Language Model (LLM) to extract communication specifications from a malware binary and automatically generate a pseudo-C2 server. Experiments using Mirai demonstrate that the proposed system semantically interprets binary control structures and extracts all 20 core protocol elements in agreement with the ground truth (100% specification extraction accuracy). The generated pseudo-C2 server fully reproduces seven of ten DDoS attack vectors with attack behavior consistent with the original C2. When applied to a customized variant created by modifying the publicly available Mirai source code, the method succeeds end-to-end, from specification extraction through pseudo-C2 generation to attack reproduction, demonstrating that the LLM infers specifications from binary structures without relying on pre-trained knowledge. This approach extends the applicability of LLMs from analysis assistance to the automated construction of dynamic analysis environments.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

The pipeline turns Ghidra output into working pseudo-C2 servers for Mirai, but the evidence does not yet prove the LLM infers protocols without pre-trained knowledge.

The main point is that this system combines Ghidra decompilation with an LLM to pull C2 protocol details from IoT malware binaries and then writes a pseudo-server that can wake up the sample and trigger its attack code.

What stands out is the end-to-end automation: the paper reports extracting all 20 core protocol elements correctly on Mirai and getting seven of ten DDoS vectors to run with behavior that matches the real C2. The test on an author-modified Mirai variant is a direct attempt to show the method is not just recalling known code.

The practical payoff is clear for analysts who need to study dormant IoT bots. If the extraction step holds, it removes a common blocker in dynamic analysis.

The soft spots are in the supporting evidence. The abstract gives no count of samples tested, no description of how ground truth was built, and no results outside the Mirai family. The claim that the LLM works purely from binary structures on the customized variant rests on the assumption that the source changes removed all recognizable patterns, but the paper supplies no list of what was actually altered. That leaves room for the LLM to match residual motifs from its training data. Controls for hallucination are also absent.

This is aimed at the narrow group doing IoT malware dynamic analysis. Those readers could try the pipeline on their own samples even with the current gaps. The work is coherent on its own terms and tackles a real tooling problem, so it deserves a serious referee rather than a desk reject. The authors will need to add sample counts, ground-truth methods, modification details, and at least one non-Mirai family in any revision.

Referee Report

2 major / 0 minor

Summary. The paper proposes a system combining Ghidra with an LLM to extract communication protocol specifications from IoT malware binaries and automatically generate pseudo-C2 servers for dynamic analysis of dormant samples. Experiments on Mirai report 100% accuracy extracting all 20 core protocol elements matching ground truth and full reproduction of 7/10 DDoS attack vectors with consistent behavior. End-to-end success on a customized Mirai variant (source modified then recompiled) is presented as evidence that the LLM infers specifications from binary control structures without relying on pre-trained knowledge.

Significance. If the central empirical claims are substantiated with detailed methodology and controls, the work could meaningfully extend LLM applications in malware analysis from assistance to automated construction of dynamic analysis environments, addressing the challenge of short-lived C2 infrastructure in IoT botnets.

major comments (2)

[Abstract] The 100% specification extraction accuracy and 7/10 attack reproduction are presented without any information on how ground truth was established, the number of samples tested, error rates on non-Mirai families, or controls for LLM hallucination.
[Abstract] The claim that success on the customized Mirai variant demonstrates inference without pre-trained knowledge is load-bearing for the novelty argument, yet no details are supplied on the scope of source modifications (e.g., protocol fields, control flows, or strings altered) or control experiments with unrelated IoT malware families.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the thoughtful comments on the abstract. We agree that additional methodological context is needed to substantiate the claims and will revise the abstract and related sections accordingly.

Referee: [Abstract] The 100% specification extraction accuracy and 7/10 attack reproduction are presented without any information on how ground truth was established, the number of samples tested, error rates on non-Mirai families, or controls for LLM hallucination.

Authors: We agree that the abstract lacks sufficient detail on these points. We will revise the abstract to briefly state that ground truth was established via manual comparison to the publicly available Mirai source code and protocol documentation, that experiments used multiple Mirai samples, that the evaluation was limited to the Mirai family (hence no non-Mirai error rates), and that hallucination was addressed through cross-verification with Ghidra decompilation outputs and repeated LLM queries with consistency checks. revision: yes

Referee: [Abstract] The claim that success on the customized Mirai variant demonstrates inference without pre-trained knowledge is load-bearing for the novelty argument, yet no details are supplied on the scope of source modifications (e.g., protocol fields, control flows, or strings altered) or control experiments with unrelated IoT malware families.

Authors: We acknowledge that more detail is required to support this claim. We will revise the manuscript to describe the specific source modifications made to the public Mirai code (e.g., alterations to protocol fields, control flow changes, and string modifications). We did not conduct experiments on unrelated families because the customized variant was designed to isolate whether the LLM infers from binary structures rather than relying on prior Mirai knowledge; we will add an explicit discussion of this experimental design choice. revision: yes
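The first response mentions "repeated LLM queries with consistency checks" without giving a procedure. One plausible reading, sketched here purely as an assumption, is to run the extraction several times and accept an element only when enough runs agree on its value. The function name, the agreement threshold, and the sample element names are hypothetical, and values are assumed to be hashable.

```python
from collections import Counter

_MISSING = "<missing>"  # marks a run that did not report the element


def check_consistency(runs: list, min_agreement: float = 1.0):
    """Split elements into accepted values and flagged disagreements.

    runs: one dict per LLM query, mapping element name -> extracted value.
    An element is accepted when its most common value appears in at least
    `min_agreement` of the runs and is not the missing marker.
    """
    if not runs:
        raise ValueError("at least one run is required")
    names = sorted(set().union(*(run.keys() for run in runs)))
    accepted, flagged = {}, {}
    for name in names:
        votes = Counter(run.get(name, _MISSING) for run in runs)
        value, count = votes.most_common(1)[0]
        if value != _MISSING and count / len(runs) >= min_agreement:
            accepted[name] = value
        else:
            flagged[name] = dict(votes)  # keep the vote counts for manual review
    return accepted, flagged


# Hypothetical output of three extraction runs on the same binary.
runs = [
    {"port": 1000, "keepalive_seconds": 60},
    {"port": 1000, "keepalive_seconds": 60},
    {"port": 1000, "keepalive_seconds": 30},
]
accepted, flagged = check_consistency(runs)
print("accepted:", accepted)
print("flagged: ", flagged)
```

Agreement across runs only shows that the model is stable, not that it is correct: a value recalled from training data would pass this check every time, so it complements the comparison against decompiler output rather than replacing it.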

Circularity Check

0 steps flagged

No circularity: empirical claims rest on direct measurements against ground truth

The paper describes an LLM-assisted system for extracting protocol specs from binaries and generating pseudo-C2 servers, evaluated via direct experiments on Mirai (100% extraction of 20 elements, 7/10 attack vectors reproduced) and a source-modified variant. No equations, fitted parameters, or predictions appear; results are reported as empirical agreement with ground truth. The central claim that success on the variant shows inference 'without relying on pre-trained knowledge' is an interpretive conclusion from the experiment, not a self-definitional reduction or load-bearing self-citation. No patterns from the enumerated circularity kinds are present, and the derivation chain is self-contained as a system description plus measurements.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the domain assumption that binary control structures contain sufficient semantic information for an LLM to reconstruct a complete protocol specification without external knowledge.

axioms (1)

domain assumption: LLM can semantically interpret binary control structures to extract protocol elements
Invoked to justify the 100% extraction accuracy on both original and customized Mirai.

pith-pipeline@v0.9.1-grok · 5721 in / 1323 out tokens · 27016 ms · 2026-06-26T14:03:45.300755+00:00 · methodology

Reference graph

Works this paper leans on

16 extracted references · 3 canonical work pages

[1] National Institute of Information and Communications Technology (NICT). NICTER Observation Report 2024. https://www.nicter.jp/report, 2025. (in Japanese)
[2] J. Gamblin. Mirai-Source-Code. https://github.com/jgamblin/Mirai-Source-Code, 2016. Accessed: Feb., 2026
[3] D. Gomes, E. Felix, F. Aires, and M. Vieira. Static code analysis for iot security: A systematic literature review. ACM Computing Surveys, 58(3):1–47, 2025
[4] Omar Alrawi, Chaz Lever, Kevin Valakuzhy, Ryan Court, Kevin Snow, Fabian Monrose, and Manos Antonakakis. The circle of life: A large-scale study of the iot malware lifecycle. In Proceedings of the 30th USENIX Security Symposium, pages 3505–3522, 2021
[5] G DATA CyberDefense. Reverse engineering and observing an iot botnet, Aug. 2020. URL https://blog.gdatasoftware.com/2020/08/36243-reverse-engineering-and-observing-an-iot-botnet. Accessed: Feb., 2026
[6] Hai-Viet Le and Quoc-Dung Ngo. V-sandbox for dynamic analysis iot botnet. IEEE Access, 8:145768–145786, 2020
[7] Ali Davanian, Ahmad Darki, and Michalis Faloutsos. CnCHunter: An MITM-approach to Identify Live CnC Servers. Black Hat USA 2021 (Whitepaper), 2021. URL https://i.blackhat.com/USA21/Wednesday-Handouts/us-21-CnCHunter-An-MITM-Approach-To-Identify-Live-CnC-Servers-wp.pdf
[8] Ali Davanian, Michalis Faloutsos, and Martina Lindorfer. C2Miner: Tricking IoT Malware into Revealing Live Command & Control Servers. In Proceedings of the 19th ACM Asia Conference on Computer and Communications Security (ASIA CCS ’24), pages 112–127, 2024. doi:10.1145/3634737.3644992
[9] Marius Musch, Martin Härterich, and Martin Johns. Towards an Automatic Generation of Low-Interaction Web Application Honeypots. In Proceedings of the 13th International Conference on Availability, Reliability and Security (ARES ’18), pages 1–6, New York, NY, USA, 2018. Association for Computing Machinery. doi:10.1145/3230833.3230839
[10] Luca Borzacchiello, Emilio Coppa, Daniele Cono D’Elia, and Camil Demetrescu. Reconstructing c2 servers for remote access trojans with symbolic execution. In Shlomi Dolev, Danny Hendler, Sachin Lodha, and Moti Yung, editors, Cyber Security Cryptography and Machine Learning, pages 121–140, Cham, 2019. Springer International Publishing. ISBN 978-3-030-20951-3
[11] Ahmad Darki and Michalis Faloutsos. RIoTMAN: A Systematic Analysis of IoT Malware Behavior. In Proceedings of the 16th International Conference on Emerging Networking EXperiments and Technologies (CoNEXT ’20), pages 169–182, 2020. doi:10.1145/3386367.3431317
[12] H. Jelodar, S. Bai, P. Hamedi, H. Mohammadian, R. Razavi-Far, and A. Ghorbani. Large language model (llm) for software security: Code analysis, malware analysis, reverse engineering. arXiv preprint arXiv:2504.07137, 2025
[13] Z. Li, S. Dutta, and M. Naik. Iris: Llm-assisted static analysis for detecting security vulnerabilities. In Proceedings of the International Conference on Learning Representations (ICLR), 2025
[14] S. Fujii and R. Yamagishi. Feasibility study for supporting static malware analysis using llm. arXiv preprint arXiv:2411.14905, 2024
[15] Laurie Wired. Ghidramcp. https://github.com/LaurieWired/GhidraMCP, 2025. Accessed: Feb., 2026
[16] Antonio Affinito, Savio Zinno, Gennaro Stanco, Alessio Botta, and Giorgio Ventre. The evolution of mirai botnet scans over a six-year period. Journal of Information Security and Applications, 79:103629, 2023
