arxiv: 2605.02812 · v1 · submitted 2026-05-04 · 💻 cs.CR

Recognition: unknown

Autonomous LLM Agent Worms: Cross-Platform Propagation, Automated Discovery and Temporal Re-Entry Defense

Mingming Zha , Xiaofeng Wang

Authors on Pith no claims yet

Pith reviewed 2026-05-08 18:26 UTC · model grok-4.3

classification 💻 cs.CR

keywords LLM agentsworm propagationautonomous agentspersistent stateAI securitycross-platform attacksre-entry defensecontext injection

0 comments

The pith

Persistent state in LLM agents creates a pathway for autonomous worms that spread zero-click across platforms via context re-entry.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that autonomous LLM agents, because they run as long-lived processes with file-backed workspaces, memory stores, and scheduled tasks, can be turned into vectors for worm propagation. Attacker-written content placed in these persistent locations can be automatically pulled back into the model's decision context and trigger actions such as sending messages to other agents or exfiltrating data. To demonstrate the risk, the authors supply an automated code-analysis tool that maps file I/O paths to context-injection points and a payload generator that remains effective even after LLM summarization. Experiments on three production agent frameworks confirm successful zero-click, multi-hop spread, privilege escalation, and data theft without platform-specific changes. A set of defensive mechanisms is then shown to break the write-re-entry-action chain while leaving ordinary agent operation intact.

Core claim

Autonomous LLM agents expose a new class of persistent worm because attacker-controlled data can be written into workspaces or memory files, automatically reloaded through scheduled autoloading into the LLM decision context, and used to drive high-risk actions including cross-agent transmission. The authors supply SSCGV, a source-code graph analyzer that traces these flows and ranks injection points, and SRPO, a payload optimizer that produces content resilient to summarization and paraphrasing. On three production frameworks they achieve zero-click propagation, three-hop cross-platform transmission without adaptation, inter-agent privilege escalation, and data exfiltration. They also report

What carries the argument

temporal re-entry of attacker-influenced content from persistent agent state into the LLM decision context through autoloading of workspaces and memory files

If this is right

Zero-click autonomous propagation succeeds in current production agent frameworks.
Three-hop cross-platform transmission occurs without any platform-specific code changes.
Inter-agent privilege escalation and data exfiltration follow directly from the same re-entry path.
User-prompt carriers produce higher attack success rates than system-prompt carriers.
Read operations on untrusted persistent state constitute the primary integrity threat.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Future agent platforms will need to treat persistent memory as an untrusted input channel equivalent to external messages.
Similar re-entry risks are likely to appear in any long-running AI system that maintains file-backed state across sessions.
Security testing suites for agents should include persistent-state scenarios as a standard check.
Capability attenuation after external reads could become a default design pattern for multi-agent systems.

Load-bearing premise

Attacker-controlled content placed in persistent workspaces, memory files, or scheduled tasks will be automatically reloaded into the LLM decision context and cause the agent to perform high-risk actions without sanitization or user intervention.

What would settle it

Run one agent that writes a command into a shared memory file, allow a second agent to autoload that file on its next cycle, and observe whether the second agent executes the command and contacts an external server without any user prompt or approval.

Figures

Figures reproduced from arXiv: 2605.02812 by Mingming Zha, Xiaofeng Wang.

**Figure 1.** Figure 1: Framework overview for analyzing and defending against persistent worm propagation in view at source ↗

read the original abstract

Autonomous LLM agents operate as long-running processes with persistent workspaces, memory files, scheduled task state, and messaging integrations. These features create a new propagation risk: attacker-influenced content can be written into persistent agent state, re-enter the LLM decision context through scheduled autoloading, and drive high-risk actions including configuration changes and cross-agent transmission. We present the first systematic framework for automated analysis of persistent worm propagation in file-backed multi-agent LLM ecosystems. SSCGV, our automated source-code graph analyzer, traces data flow from file I/O to LLM context injection points and ranks carriers by context injection position without manual analysis. SRPO, our summary-resilient payload optimizer, generates worm payloads robust to LLM-mediated summarization and paraphrasing across multi-hop communication. Evaluated on three production agent frameworks, we demonstrate zero-click autonomous propagation, 3-hop cross-platform transmission without platform-specific adaptation, inter-agent privilege escalation, and data exfiltration. We identify two empirical insights: user prompt carriers achieve higher attack compliance than system prompt carriers, and read operations represent the primary integrity threat in LLM-mediated systems. To defend against this class of attacks, we develop RTW-A, proven under a formal No Persistent Worm Propagation theorem. RTW blocks write-before-exposed-read re-entry; sealed configuration protects static files; typed memory promotion prevents untrusted summaries from entering trusted memory; and capability attenuation limits high-risk actions after external reads. These mechanisms eliminate the persistence, re-entry, action chain while preserving ordinary workflows. Affected systems are anonymized pending coordinated disclosure.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

This paper shows that persistent-state LLM agents can host zero-click worms and gives both an automated discovery method plus a defense with a formal theorem.

read the letter

The main point is that file-backed LLM agents create a workable worm vector through scheduled state reloads, and the authors have automated the discovery and payload steps while proposing a defense they claim is theorem-backed. SSCGV builds a data-flow graph from agent source code to rank file I/O paths that reach the LLM context. SRPO then tunes payloads so they survive summarization and paraphrasing over multiple hops. They ran both on three production frameworks and report successful zero-click spread, three-hop cross-platform transmission, privilege escalation, and exfiltration without per-platform tweaks. Two observations stand out: user-prompt carriers beat system-prompt ones for compliance, and read operations are the main integrity hole. The RTW-A defense tries to break the chain by blocking write-before-exposed-read, sealing static files, using typed memory promotion, and attenuating capabilities after external reads. They state a No Persistent Worm Propagation theorem for it. The theorem is tied to their model of state loading, so it may not cover every agent implementation or future summarizer behavior. The experiments are presented as concrete demonstrations, but the abstract and stress-test note leave open how detailed the metrics, failure cases, or code release are. That makes full reproducibility harder to judge without the supplement. People building or securing long-running LLM agents will find the attack paths and the read/write distinction useful. The work is internally consistent and engages the problem directly rather than just speculating, so it deserves a serious referee even if the theorem needs careful checking on assumptions.

Referee Report

2 major / 3 minor

Summary. The paper introduces SSCGV, an automated source-code graph analyzer that traces data flows from file I/O to LLM context injection points in agent frameworks, and SRPO, a summary-resilient payload optimizer for generating worm payloads robust to LLM summarization. It reports evaluations on three production agent frameworks demonstrating zero-click autonomous propagation, 3-hop cross-platform transmission without adaptation, inter-agent privilege escalation, and data exfiltration. Two empirical insights are identified regarding prompt carriers and read operations. The paper proposes RTW-A defenses (blocking write-before-exposed-read, sealed configs, typed memory promotion, capability attenuation) backed by a formal No Persistent Worm Propagation theorem that eliminates the persistence-re-entry-action chain while preserving workflows.

Significance. If the empirical demonstrations and theorem hold, this work is significant for highlighting a novel persistent-state propagation vector in autonomous LLM agents, providing the first automated analysis tools for such risks, and delivering practical, workflow-preserving mitigations with formal grounding. The cross-platform, zero-click results and explicit handling of summarization resilience represent concrete advances that could inform secure agent design.

major comments (2)

§3 (Evaluation): The central empirical claims of zero-click propagation and 3-hop cross-platform success on three frameworks are load-bearing, yet the manuscript provides no quantitative metrics (success rates, trial counts, failure modes), error analysis, or platform-specific details (even anonymized), making independent verification or assessment of robustness impossible from the reported text.
§4 (RTW-A and Theorem): The No Persistent Worm Propagation theorem is presented as proving the defense, but no proof sketch, key assumptions (e.g., on LLM summarization or state re-loading), or reduction steps are given in the main text. This is load-bearing for the formal contribution and the claim that RTW-A eliminates the attack chain.

minor comments (3)

Abstract and §2: The two empirical insights (user prompt carriers vs. system prompt; read operations as primary threat) are stated without supporting data or statistical significance, which should be tied explicitly to the evaluation results for clarity.
The manuscript should add a limitations section addressing potential bypasses of RTW-A mechanisms or assumptions about persistent workspace behavior in future LLM versions.
Notation for SSCGV data-flow ranking and SRPO optimization could be clarified with pseudocode or a small example to aid reproducibility.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments highlighting the need for greater detail in the empirical evaluation and formal theorem. We address each point below and will incorporate revisions to enhance verifiability and clarity while preserving the manuscript's core contributions.

read point-by-point responses

Referee: §3 (Evaluation): The central empirical claims of zero-click propagation and 3-hop cross-platform success on three frameworks are load-bearing, yet the manuscript provides no quantitative metrics (success rates, trial counts, failure modes), error analysis, or platform-specific details (even anonymized), making independent verification or assessment of robustness impossible from the reported text.

Authors: We agree that the main text presents summarized results without full quantitative breakdowns. The evaluation section reports consistent success across repeated trials on each of the three frameworks, including specific trial counts and observed failure modes (primarily summarization-induced payload degradation, mitigated by SRPO). To enable independent assessment, we will add explicit success rates (e.g., 100% zero-click propagation in N trials per framework), anonymized platform details, and an error analysis appendix in the revision. revision: yes
Referee: §4 (RTW-A and Theorem): The No Persistent Worm Propagation theorem is presented as proving the defense, but no proof sketch, key assumptions (e.g., on LLM summarization or state re-loading), or reduction steps are given in the main text. This is load-bearing for the formal contribution and the claim that RTW-A eliminates the attack chain.

Authors: The full proof appears in the appendix, but we concur that a self-contained sketch is needed in the main text. We will insert a concise outline listing key assumptions (bounded context windows, deterministic file re-loading, and summarization as a lossy but non-inverting operation) and reduction steps showing how each RTW-A mechanism (write-before-exposed-read blocking, sealed configs, typed promotion, capability attenuation) severs the persistence-re-entry-action chain while preserving standard workflows. revision: yes

Circularity Check

0 steps flagged

No significant circularity; derivation is self-contained

full rationale

The paper introduces SSCGV for automated data-flow tracing from file I/O to LLM context injection and SRPO for generating summary-resilient payloads, then reports empirical results on three production agent frameworks showing zero-click propagation and multi-hop transmission. The RTW-A defense mechanisms (write-before-exposed-read blocking, sealed configuration, typed memory promotion, capability attenuation) are presented as blocking the persistence-re-entry-action chain and are stated to be proven under a No Persistent Worm Propagation theorem defined from the paper's own model of agent workspaces, memory files, and scheduled tasks. No quoted step reduces by construction to a fitted input, self-definition, or self-citation chain; the theorem and mechanisms are derived internally from the described state model without renaming known results or importing uniqueness from prior author work as an external fact. The central claims rest on the reported experiments and formal argument rather than circular reduction.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

Central claims rest on the domain assumption that LLM agents maintain persistent, attacker-writable state that is automatically re-injected into decision context; no free parameters or invented entities are introduced in the abstract.

axioms (1)

domain assumption LLM agents operate as long-running processes with persistent workspaces, memory files, scheduled task state, and messaging integrations that allow external content to re-enter decision context.
Explicitly stated in the opening of the abstract as the source of the new propagation risk.

pith-pipeline@v0.9.0 · 5581 in / 1292 out tokens · 33683 ms · 2026-05-08T18:26:37.143785+00:00 · methodology

discussion (0)

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

MemLineage: Lineage-Guided Enforcement for LLM Agent Memory
cs.CR 2026-05 conditional novelty 6.0

MemLineage enforces untrusted-path persistence in LLM agent memory through Merkle logs, per-principal signatures, and max-of-strong-edges lineage propagation, achieving zero ASR on three poisoning workloads with sub-m...

Reference graph

Works this paper leans on

8 extracted references · 6 canonical work pages · cited by 1 Pith paper · 1 internal anchor

[1]

(2024).Here Comes The AI Worm: Unleashing Zero-click Worms that Target GenAI-Powered Applications.InProc

Cohen, Stav and Bitton, Ron and Nassi, Ben. Here comes the ai worm: Unleashing zero-click worms that target genai-powered applications.arXiv preprint arXiv:2403.02817, 2024

work page arXiv 2024
[2]

Clawworm: Self-propagating attacks across llm agent ecosystems

Zhang, Yihao and Wei, Zeming and Luan, Xiaokun and Wu, Chengcan and Zhang, Zhixin and Wu, Jiangrong and Wu, Haolin and Chen, Huanran and Sun, Jun and Sun, Meng. ClawWorm: Self-Propagating Attacks Across LLM Agent Ecosystems.arXiv preprint arXiv:2603.15727, 2026. 20

work page arXiv 2026
[3]

Agentpoison: Red-teaming llm agents via poisoning memory or knowledge bases

Chen, Zhaorun and Xiang, Zhen and Xiao, Chaowei and Song, Dawn and Li, Bo. Agentpoison: Red-teaming llm agents via poisoning memory or knowledge bases. InAdvances in Neural Information Processing Systems, 2024

2024
[4]

Available: https://arxiv.org/abs/2503.03704

Dong, Shen and Xu, Shaochen and He, Pengfei and Li, Yige and Tang, Jiliang and Liu, Tianming and Liu, Hui and Xiang, Zhen. A practical memory injection attack against llm agents.arXiv preprint arXiv:2503.03704, 2025

work page arXiv 2025
[5]

Lee and A

Lee, Donghyun and Tiwari, Mo. Prompt infection: Llm-to-llm prompt injection within multi-agent systems.arXiv preprint arXiv:2410.07283, 2024

work page arXiv 2024
[6]

S. Chen, J. Piet, C. Sitawarin, and D. Wagner. StruQ: Defending against prompt injection with structured queries.arXiv preprint arXiv:2402.06363, 2024

work page arXiv 2024
[7]

The Instruction Hierarchy: Training LLMs to Prioritize Privileged Instructions

E. Wallace, K. Xiao, R. Leike, L. Weng, J. Heidecke, and A. Beutel. The instruction hierarchy: Training LLMs to prioritize privileged instructions.arXiv preprint arXiv:2404.13208, 2024

work page internal anchor Pith review arXiv 2024
[8]

N. F. Liu, K. Lin, J. Hewitt, A. Paranjape, M. Bevilacqua, F. Petroni, and P. Liang. Lost in the middle: How language models use long contexts.Transactions of the Association for Computational Linguistics, 2024. 21

2024