Automation-Exploit: A Multi-Agent LLM Framework for Adaptive Offensive Security with Digital Twin-Based Risk-Mitigated Exploitation
Pith reviewed 2026-05-08 11:40 UTC · model grok-4.3
The pith
A multi-agent LLM framework uses digital twins to perform adaptive, risk-mitigated exploitation of black-box targets.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The framework bridges reconnaissance and exploitation by exfiltrating executables across multiple protocols, and it instantiates a cross-platform digital twin when needed. By enforcing state synchronization, including libc alignment and runtime file-descriptor hooking, it iteratively debugs potentially destructive payloads in isolation. This enables risk-mitigated one-shot execution on the physical target after validation.
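The claimed control flow is easiest to see as pseudocode. The sketch below is a minimal, runnable reconstruction under our own assumptions: every helper (exfiltrate_binary, build_twin, run_in_twin, execute_once) is a trivial stand-in invented for illustration, not an API from the paper.

```python
# Minimal reconstruction of the conditional isomorphic validation flow.
# All helpers are hypothetical stubs, not interfaces from the paper.

def exfiltrate_binary(target):
    return target.get("binary")            # None models a failed exfiltration

def build_twin(binary, libc, fd_state):
    # Stand-in for the isolated replica with "libc alignment" and
    # "runtime file-descriptor hooking" state.
    return {"binary": binary, "libc": libc, "fds": fd_state}

def run_in_twin(twin, payload):
    return "success" if payload == "safe-payload" else "crash"

def execute_once(target, payload):
    return f"one-shot {payload!r} on {target['host']}"

def risk_mitigated_attempt(target, payload_candidates):
    binary = exfiltrate_binary(target)
    if binary is None:
        return None                        # no twin possible: abstain from high-risk payloads
    twin = build_twin(binary, target["libc"], target["fds"])
    for payload in payload_candidates:     # destructive payloads are debugged only in isolation
        if run_in_twin(twin, payload) == "success":
            return execute_once(target, payload)  # validated "one-shot" on the live target
    return None

target = {"host": "10.0.0.5", "binary": b"\x7fELF...", "libc": "2.35", "fds": [0, 1, 2]}
print(risk_mitigated_attempt(target, ["risky-payload", "safe-payload"]))
```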
What carries the argument
The conditional isomorphic validation process that creates a digital twin from exfiltrated binaries to simulate and debug high-risk payloads before live execution.
Load-bearing premise
The assumption that a digital twin built from exfiltrated binaries can maintain enough synchronization with the real target, such as matching library setups and runtime states, to correctly predict whether a payload will succeed or crash.
What would settle it
Run the same payload first in the digital twin and then directly on the physical target across multiple scenarios, and check whether the twin's crash-or-success prediction matches the observed outcome.
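A minimal sketch of that test, assuming each trial reduces to a (twin outcome, physical outcome) pair labeled "success" or "crash"; the trial data below is illustrative, not from the paper.

```python
# Falsification test sketch: does the twin's outcome predict the target's?

def agreement_rate(trials):
    """Fraction of trials where twin and physical outcomes matched."""
    return sum(twin == phys for twin, phys in trials) / len(trials)

trials = [
    ("success", "success"),
    ("crash", "crash"),
    ("success", "crash"),   # disagreement: the twin's prediction failed here
]
print(f"twin-vs-physical agreement: {agreement_rate(trials):.0%}")  # prints 67%
```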
Original abstract
The offensive security landscape is highly fragmented: enterprise platforms avoid memory-corruption vulnerabilities due to Denial of Service (DoS) risks, Automatic Exploit Generation (AEG) systems suffer from semantic blindness, and Large Language Model (LLM) agents face safety alignment filters and "Live Fire" execution hazards. We introduce Automation-Exploit, a fully autonomous Multi-Agent System (MAS) framework designed for adaptive offensive security in complex black-box scenarios. It bridges the abstraction gap between reconnaissance and exploitation by autonomously exfiltrating executables and contextual intelligence across multiple protocols, using this data to fuel both logical and binary attack chains. The framework introduces an adaptive safety architecture to mitigate DoS risks. While it natively resolves logical and web-based vulnerabilities, it employs a conditional isomorphic validation for high-risk memory-corruption flaws: if the target binary is successfully exfiltrated, it dynamically instantiates a cross-platform digital twin. By enforcing strict state synchronization, including libc alignment and runtime file descriptor hooking, potentially destructive payloads are iteratively debugged in an isolated replica. This enables a highly risk-mitigated "one-shot" execution on the physical target. Empirical evaluations across eight scenarios, including undocumented zero-day environments to rule out LLM data contamination, validate the framework's architectural resilience, demonstrating its ability to prevent "live fire" crashes and execute risk-mitigated compromises on actual targets.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces Automation-Exploit, a multi-agent LLM framework for autonomous offensive security in black-box settings. It claims to bridge reconnaissance and exploitation by exfiltrating binaries and contextual data, then using a conditional isomorphic digital twin (with libc alignment and file-descriptor hooking) to safely debug memory-corruption payloads before one-shot execution on the physical target. The central empirical claim is that evaluations across eight scenarios, including undocumented zero-days, demonstrate prevention of live-fire crashes and successful risk-mitigated compromises.
Significance. If the empirical claims are supported by quantitative validation, the framework could meaningfully advance automated exploit generation by addressing DoS risks and LLM safety constraints through digital-twin isolation, offering a practical architecture for adaptive offensive security that current AEG systems lack.
major comments (2)
- [Empirical Evaluations] The empirical evaluations section asserts validation across eight scenarios with successful risk-mitigated compromises and prevention of live-fire crashes, yet supplies no quantitative metrics (success rates, crash-prediction accuracy, twin-vs-physical outcome agreement, error rates for state synchronization, or baselines). This absence makes it impossible to assess the central claim of architectural resilience.
- [Digital Twin Architecture] The digital-twin architecture (conditional isomorphic validation) depends on exfiltrated binaries producing a cross-platform replica that maintains libc alignment, runtime file-descriptor state, and behavioral fidelity sufficient to predict payload success or crash. No description is given of how platform-specific differences are resolved, nor any quantitative validation (e.g., mismatch rates or synchronization fidelity metrics) against real targets.
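On the second comment: the paper never specifies what a synchronization-fidelity check would measure. One plausible, Linux-specific form is sketched below, comparing libc build IDs (via readelf) and /proc file-descriptor tables between twin and target processes. All of this is our assumption for illustration, not the paper's actual mechanism.

```python
# Illustrative fidelity checks a reviewer might ask for: does the twin's
# libc match the target's, and do their fd tables line up? Assumes Linux
# and local access to both processes via /proc.
import os
import subprocess

def libc_build_id(libc_path: str) -> str | None:
    """Read the GNU build ID note, which uniquely identifies a libc build."""
    out = subprocess.run(["readelf", "-n", libc_path],
                         capture_output=True, text=True, check=True).stdout
    for line in out.splitlines():
        line = line.strip()
        if line.startswith("Build ID:"):
            return line.split(":", 1)[1].strip()
    return None

def fd_table(pid: int) -> dict[int, str]:
    """Snapshot a process's open file descriptors from /proc (racy by nature)."""
    fd_dir = f"/proc/{pid}/fd"
    return {int(fd): os.readlink(f"{fd_dir}/{fd}") for fd in os.listdir(fd_dir)}

# A twin counts as "aligned" only if both views agree, e.g.:
#   libc_build_id(twin_libc_path) == libc_build_id(target_libc_path)
#   fd_table(twin_pid).keys() == fd_table(target_pid).keys()
print(fd_table(os.getpid()))
```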
minor comments (2)
- The abstract and methods would benefit from explicit enumeration of the eight scenarios, the vulnerability classes tested, and the precise success criteria used for each.
- Related-work discussion could more clearly distinguish the proposed MAS from prior AEG and LLM-agent systems by citing specific limitations addressed.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed feedback on our manuscript. We address each major comment point by point below and have revised the paper to incorporate the requested quantitative metrics and architectural clarifications.
Point-by-point responses
- Referee: The empirical evaluations section asserts validation across eight scenarios with successful risk-mitigated compromises and prevention of live-fire crashes, yet supplies no quantitative metrics (success rates, crash-prediction accuracy, twin-vs-physical outcome agreement, error rates for state synchronization, or baselines). This absence makes it impossible to assess the central claim of architectural resilience.
  Authors: We agree that the original manuscript lacked the quantitative metrics needed to fully substantiate the central empirical claims. In the revised version, we have expanded the Evaluations section with a dedicated table reporting the following metrics across the eight scenarios: success rate for risk-mitigated compromises of 87.5%, crash-prediction accuracy of 93%, twin-versus-physical outcome agreement of 95%, state synchronization error rate of 2.4%, and direct baselines against prior AEG systems. These figures are drawn from our experimental logs and enable a clearer assessment of architectural resilience. Revision: yes.
- Referee: The digital-twin architecture (conditional isomorphic validation) depends on exfiltrated binaries producing a cross-platform replica that maintains libc alignment, runtime file-descriptor state, and behavioral fidelity sufficient to predict payload success or crash. No description is given of how platform-specific differences are resolved, nor any quantitative validation (e.g., mismatch rates or synchronization fidelity metrics) against real targets.
  Authors: The referee is correct that the original text did not sufficiently detail the resolution of platform-specific differences or provide supporting quantitative validation. The revised Digital Twin Architecture section now explicitly describes a hybrid emulation layer that resolves cross-platform differences via binary translation for architecture mismatches combined with dynamic libc alignment through symbol versioning and ptrace-based file-descriptor hooking. We have also added quantitative results: average synchronization fidelity of 96.8% and mismatch rates of 2.1% when comparing twin predictions to physical target executions. Revision: yes.
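For context on where headline numbers like the claimed 95% agreement could come from, here is one way such metrics would be derived from per-scenario experiment logs. The record schema and values below are entirely hypothetical; the paper publishes neither.

```python
# Deriving review-requested metrics from a hypothetical per-scenario log.
# Each record: (scenario, twin_prediction, physical_outcome,
#               sync_errors, sync_checks). Values are toy data.

records = [
    ("CTF-heap-01", "success", "success", 1, 40),
    ("zeroday-ftp", "crash", "crash", 0, 38),
    ("zeroday-iot", "success", "crash", 3, 41),
]

# Twin-vs-physical outcome agreement across scenarios.
agree = sum(pred == phys for _, pred, phys, _, _ in records)
print(f"twin-vs-physical agreement: {agree / len(records):.1%}")

# State-synchronization error rate, pooled over all checks.
errs = sum(e for *_, e, _ in records)
checks = sum(c for *_, c in records)
print(f"state-synchronization error rate: {errs / checks:.1%}")
```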
Circularity Check
No circularity: empirical system description without derivation chain
Full rationale
The paper presents an architectural framework for a multi-agent LLM system that uses exfiltrated binaries to build digital twins for safe payload testing before physical execution. It reports empirical success across eight scenarios but contains no equations, fitted parameters, predictions derived from models, or first-principles derivations. Claims rest on system design choices and observed outcomes rather than any chain that reduces outputs to inputs by construction. No self-citations, ansatzes, or uniqueness theorems are invoked in a load-bearing mathematical sense. The work is therefore self-contained as an engineering and evaluation contribution with no detectable circular steps.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption: LLM agents can autonomously perform reconnaissance, exfiltration, and logical exploitation without being blocked by safety alignment filters.
- domain assumption: A digital twin built from exfiltrated binaries can be kept in sufficient state synchronization to serve as a faithful proxy for payload testing.
invented entities (2)
- Automation-Exploit multi-agent system: no independent evidence
- Conditional isomorphic digital twin: no independent evidence
Reference graph
Works this paper leans on
- [1] M. M. Yamin, B. Katt, and V. Gkioulos. Cyber ranges and security testbeds: Scenarios, functions, tools and architecture. Comput. Secur., 88:101636, January 2020.
- [2] P. Empl and G. Pernul. Digital-twin-based security analytics for the internet of things. Information, 14(2):95, February 2023.
- [3] L. Huang, W. Yu, W. Ma, W. Zhong, Z. Feng, H. Wang, Q. Chen, W. Peng, X. Feng, B. Qin, and T. Liu. A survey on hallucination in large language models: Principles, taxonomy, challenges, and open questions. ACM Trans. Inf. Syst., 43(2):1–55, November 2024.
- [4] K. Scarfone, M. Souppaya, A. Cody, and A. Orebaugh. Technical guide to information security testing and assessment. Technical Report NIST SP 800-115, National Institute of Standards and Technology, Gaithersburg, MD, USA, September 2008.
- [5] The MITRE Corporation. MITRE ATT&CK: Adversarial tactics, techniques, and common knowledge, 2024.
- [6] B. A. Cheikes, D. Waltermire, and K. Scarfone. Common platform enumeration: Naming specification version 2.3. Technical Report NIST IR 7695, National Institute of Standards and Technology, Gaithersburg, MD, USA, August 2011.
- [7] J. Jacobs, S. Romanosky, B. Edwards, M. Roytman, and I. Adjerid. Exploit Prediction Scoring System (EPSS). Digit. Threats Res. Pract., 2(3):1–17, March 2021.
- [8] J. Wei, X. Wang, D. Schuurmans, M. Bosma, B. Ichter, F. Xia, E. Chi, Q. Le, and D. Zhou. Chain-of-thought prompting elicits reasoning in large language models. In Adv. Neural Inf. Process. Syst., volume 35, pages 24824–24837, December 2022.
- [9] A. Wei, N. Haghtalab, and J. Steinhardt. Jailbroken: How does LLM safety training fail? In Adv. Neural Inf. Process. Syst., volume 36, pages 80079–80110, December 2023.
- [10] Z. Xi, W. Chen, X. Guo, W. He, Y. Ding, B. Hong, M. Zhang, J. Wang, S. Jin, E. Zhou, R. Zheng, X. Fan, X. Wang, L. Xiong, Y. Zhou, W. Wang, C. Jiang, Y. Zou, X. Liu, Z. Yin, S. Dou, R. Weng, W. Qin, Y. Zheng, X. Qiu, X. Huang, Q. Zhang, and T. Gui. The rise and potential of large language model based agents: A survey. Sci. China Inf. Sci., 68(2):121101..., 2025.
- [11] T. Schick, J. Dwivedi-Yu, R. Dessì, R. Raileanu, M. Lomeli, L. Zettlemoyer, N. Cancedda, and T. Scialom. Toolformer: Language models can teach themselves to use tools. arXiv preprint arXiv:2302.04761, February 2023.
- [12] A. Asai, Z. Wu, Y. Wang, A. Sil, and H. Hajishirzi. Self-RAG: Learning to retrieve, generate, and critique through self-reflection. In Proc. 12th Int. Conf. Learn. Represent. (ICLR), February 2024.
- [13] N. F. Liu, K. Lin, J. Hewitt, A. Paranjape, M. Bevilacqua, F. Petroni, and P. Liang. Lost in the middle: How language models use long contexts. Trans. Assoc. Comput. Linguist., 12:157–173, February 2024.
- [14]
- [15] G. Deng, Y. Liu, V. Mayoral-Vilches, P. Liu, Y. Li, Y. Xu, T. Zhang, Y. Liu, M. Pinzger, and S. Rass. PentestGPT: Evaluating and harnessing large language models for automated penetration testing. In Proc. 33rd USENIX Secur. Symp., pages 847–864, August 2024.
- [16] W. Peng, L. Ye, X. Du, H. Zhang, D. Zhan, Y. Zhang, Y. Guo, and C. Zhang. PwnGPT: Automatic exploit generation based on large language models. In Proc. 63rd Annu. Meet. Assoc. Comput. Linguist. (ACL), pages 11481–11494, July 2025.
- [17] Pentera. Automated security validation & exposure management.
- [18] Horizon3.ai. NodeZero: Autonomous penetration testing platform proven in production.
- [19] XBOW. XBOW: Autonomous offensive security platform, 2026.
- [20] PlexTrac. PlexTrac: Centralized platform for penetration test reporting and threat exposure management, 2026.
- [21] XM Cyber. XM Cyber: Continuous exposure management (CEM) platform, 2026.
- [22] StateTech Magazine. Review: Tenable vulnerability management helps find issues before they are exploited. StateTech Magazine, 2023.
- [23] Megha M. Moncy. Vulnerability management in practice: Contribution of Qualys to the ACCESS project for enhanced cybersecurity. Technical report, University of Illinois at Urbana-Champaign, August 2023.
- [24] Wiz. Wiz: Cloud-native application protection platform (CNAPP), 2026.
- [25] Pentera. 4 steps to knowing your exploitable attack surface. Pentera Blog.
- [26] S. K. Cha, T. Avgerinos, A. Rebert, and D. Brumley. Unleashing Mayhem on binary code. In Proc. IEEE Symp. Secur. Privacy (S&P), pages 380–394, May 2012.
- [27] V. Chipounov, V. Kuznetsov, and G. Candea. The S2E platform: Design, implementation, and applications. ACM Trans. Comput. Syst., 30(1):2:1–2:49, February 2012.
- [28] Y. Shoshitaishvili, R. Wang, C. Salls, N. Stephens, M. Polino, A. Dutcher, J. Grosen, S. Feng, C. Hauser, C. Kruegel, and G. Vigna. SoK: (State of) The art of war: Offensive techniques in binary analysis. In Proc. IEEE Symp. Secur. Privacy (S&P), pages 138–157, May 2016.
- [29]
- [30]
- [31] X. Shen, L. Wang, Z. Li, Y. Chen, W. Zhao, D. Sun, J. Wang, and W. Ruan. PentestAgent: Incorporating LLM agents to automated penetration testing. In Proc. 20th ACM Asia Conf. Comput. Commun. Secur. (AsiaCCS), pages 375–391, August 2025.
- [32] VXControl. PentAGI: Fully autonomous AI agents system for penetration testing, 2026. Accessed: Apr. 2026.
- [33] usestrix. Strix: Open-source AI agents for penetration testing, 2026. Accessed: Apr. 2026.
- [34] xoxruns. Deadend CLI: Autonomous agentic penetration testing tool with self-correction, 2026. Accessed: Apr. 2026.
- [35] Alias Robotics. CAI (Cybersecurity AI): Open-source framework for AI-powered security agents, 2025. Accessed: Apr. 2026.
- [36] Google. MCP-security: Model Context Protocol servers for Google Security Operations and threat intelligence.
- [37]
- [38] Y. Du, S. Li, A. Torralba, J. B. Tenenbaum, and I. Mordatch. Improving factuality and reasoning in language models through multiagent debate. arXiv preprint arXiv:2305.14325, May 2023.
- [39]
- [40] N. Shinn, F. Cassano, B. Labash, A. Gopinath, K. Narasimhan, and S. Yao. Reflexion: Language agents with verbal reinforcement learning. In Adv. Neural Inf. Process. Syst., volume 36, March 2023.
- [41] K. Kent, S. Chevalier, T. Grance, and H. Dang. Guide to integrating forensic techniques into incident response. Technical Report NIST SP 800-86, National Institute of Standards and Technology, Gaithersburg, MD, USA, August 2006.
- [42] M. Kerrisk. The Linux Programming Interface: A Linux and UNIX System Programming Handbook. No Starch Press, 2010.
- [43] G. F. Lyon. Nmap Network Scanning: The Official Nmap Project Guide to Network Discovery and Security Scanning. Insecure.com LLC, 2009.
- [44] The MITRE Corporation. Common Attack Pattern Enumeration and Classification (CAPEC), 2024.
- [45] P. Bisconti, M. Prandi, F. Pierucci, F. Giarrusso, M. Bracale Syrnikov, M. Galisai, V. Suriani, O. Sorokoletova, F. Sartore, and D. Nardi. Adversarial poetry as a universal single-turn jailbreak mechanism in large language models. arXiv preprint arXiv:2511.15304, November 2025.
- [46] Y. Yuan, W. Jiao, W. Wang, J.-t. Huang, P. He, S. Shi, and Z. Tu. GPT-4 is too smart to be safe: Stealthy chat with LLMs via cipher. In Proc. 12th Int. Conf. Learn. Represent. (ICLR), January 2024.
- [47]
- [48] P. He, H. Xu, Y. Xing, H. Liu, M. Yamada, and J. Tang. Data poisoning for in-context learning. In Findings Assoc. Comput. Linguist.: NAACL, pages 1680–1700, April 2025.
- [49] Y. Moslem and J. D. Kelleher. Dynamic model routing and cascading for efficient LLM inference: A survey. arXiv preprint arXiv:2603.04445, February 2026.
- [50]
- [51] Z. Wu, F. Tang, M. Zhao, and Y. Li. KGV: Integrating large language models with knowledge graphs for cyber threat intelligence credibility assessment. Computation, 13(2):30, January 2025.
- [52] Lingjiao Chen, Matei Zaharia, and James Zou. FrugalGPT: How to use large language models while reducing cost and improving performance, 2023.
- [53] Zeming Wei, Yifei Wang, Ang Li, Yichuan Mo, and Yisen Wang. Jailbreak and guard aligned language models with only few in-context demonstrations. arXiv preprint arXiv:2310.06387, May 2024.