Mind Your HEARTBEAT! Claw Background Execution Inherently Enables Silent Memory Pollution
Pith reviewed 2026-05-15 00:51 UTC · model grok-4.3
The pith
Untrusted content from background heartbeat execution in Claw AI agents can pollute shared memory and influence user-facing behavior without detection.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central finding is that the architectural choice of running heartbeat background execution in the same session as foreground interactions creates an Exposure-Memory-Behavior pathway: misinformation encountered in the background enters short-term context, can be saved to long-term memory through routine behaviors, and subsequently shapes responses to user queries. Using MissClaw, a controlled replica of Moltbook, the authors show that perceived consensus drives short-term behavioral influence with misleading rates up to 61%, that routine memory saving promotes short-term pollution into durable long-term memory at rates up to 91%, and that cross-session behavioral influence reaches 76%; even under naturalistic browsing with content dilution and context pruning, pollution still crosses session boundaries.
What carries the argument
The Exposure (E) to Memory (M) to Behavior (B) pathway, where background content shares the session context and can transfer to long-term storage.
Load-bearing premise
The load-bearing premise is that background heartbeat execution shares the identical session and memory context with user-facing conversations, allowing external content to mix in without separation.
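The premise can be made concrete with a minimal sketch. Everything below is a hypothetical illustration, not Claw's actual code: the point is only that when the heartbeat path and the foreground path write to one shared context, untrusted background content is indistinguishable from foreground state.

```python
# Minimal sketch (hypothetical, not Claw's implementation) of the
# shared-session architecture the premise describes: heartbeat and
# foreground writes land in the same context object.

class SharedSessionAgent:
    def __init__(self):
        self.session_context = []   # short-term context, shared by both paths
        self.long_term_memory = []  # durable store

    def heartbeat_tick(self, external_content):
        # Background execution: ingested content enters the SAME context
        # used for foreground interaction, with no provenance tag.
        self.session_context.append(external_content)

    def user_turn(self, query):
        # The foreground response is conditioned on everything in context,
        # including background-ingested items.
        return query, list(self.session_context)

agent = SharedSessionAgent()
agent.heartbeat_tick("viral post: product X causes data loss")  # untrusted
query, context = agent.user_turn("Should I use product X?")
assert "viral post: product X causes data loss" in context
```

Separating the two paths into distinct contexts, or tagging provenance at ingestion, would break this mixing by construction.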
What would settle it
A direct test would be to run background exposure to misleading content in a controlled Claw-like agent and then query the agent in a new session on related topics to check if behavior changes compared to controls without background exposure.
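A toy version of that protocol is sketched below. `run_agent` is a stand-in placeholder, not a real agent call; an actual test would query a Claw-like agent in a fresh session and score its answers, with matched exposed and control conditions.

```python
# Hypothetical harness for the settling test: matched trials with and
# without background exposure, then compare misleading-answer rates.
import random

def run_agent(exposed, seed):
    # Stand-in for a real agent query; here a toy model in which prior
    # background exposure raises the chance of echoing the misleading claim.
    rng = random.Random(seed)
    p_misleading = 0.6 if exposed else 0.05
    return rng.random() < p_misleading

def trial_rates(n=200):
    exposed_rate = sum(run_agent(True, s) for s in range(n)) / n
    control_rate = sum(run_agent(False, s + n) for s in range(n)) / n
    return exposed_rate, control_rate

exposed_rate, control_rate = trial_rates()
# A large gap between the two rates would indicate the E→M→B effect.
```

The control arm is essential: without it, any misleading answers could reflect the base model's priors rather than background pollution.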
original abstract
We identify a critical security vulnerability in mainstream Claw personal AI agents: untrusted content encountered during heartbeat-driven background execution can silently pollute agent memory and subsequently influence user-facing behavior without the user's awareness. This vulnerability arises from an architectural design shared across the Claw ecosystem: heartbeat background execution runs in the same session as user-facing conversation, so content ingested from any external source monitored in the background (including email, message channels, news feeds, code repositories, and social platforms) can enter the same memory context used for foreground interaction, often with limited user visibility and without clear source provenance. We formalize this process as an Exposure (E) $\rightarrow$ Memory (M) $\rightarrow$ Behavior (B) pathway: misinformation encountered during heartbeat execution enters the agent's short-term session context, potentially gets written into long-term memory, and later shapes downstream user-facing behavior. We instantiate this pathway in an agent-native social setting using MissClaw, a controlled research replica of Moltbook. We find that (1) social credibility cues, especially perceived consensus, are the dominant driver of short-term behavioral influence, with misleading rates up to 61%; (2) routine memory-saving behavior can promote short-term pollution into durable long-term memory at rates up to 91%, with cross-session behavioral influence reaching 76%; (3) under naturalistic browsing with content dilution and context pruning, pollution still crosses session boundaries. Overall, prompt injection is not required: ordinary social misinformation is sufficient to silently shape agent memory and behavior under heartbeat-driven background execution.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper identifies a security vulnerability in mainstream Claw personal AI agents arising from heartbeat-driven background execution sharing the same session and memory context as user-facing interactions. This allows untrusted external content (e.g., from social platforms) to enter via an Exposure (E) → Memory (M) → Behavior (B) pathway, silently polluting long-term memory and influencing behavior without user awareness or prompt injection. Using MissClaw, a controlled replica of Moltbook, the authors report empirical rates of 61% misleading influence from social credibility cues, 91% promotion to durable memory, and 76% cross-session behavioral effects, concluding that ordinary social misinformation suffices for silent shaping of agent behavior.
Significance. If the central claims hold after verification, the work draws attention to an under-examined architectural risk in persistent AI agents that perform background monitoring, with potential implications for user privacy, trust, and security in consumer AI systems. The replica-based experimental approach is a positive element that supports reproducibility of the E→M→B pathway under controlled conditions.
major comments (3)
- [Abstract] The reported rates (61% misleading, 91% memory persistence, 76% cross-session) are presented without any accompanying experimental protocol, trial counts, controls for content dilution or pruning, statistical tests, or raw data, rendering the quantitative claims unverifiable from the manuscript.
- [Abstract] The foundational premise that heartbeat background execution 'runs in the same session as user-facing conversation' across the Claw ecosystem is asserted without supporting evidence such as code inspection, API traces, or documentation from actual implementations; the MissClaw results demonstrate the pathway only when session sharing is instantiated by construction.
- [Formalization section] The E→M→B pathway formalization: The description lacks precise conditions or thresholds for when short-term context is written to long-term memory versus pruned, making it difficult to evaluate the generality of the reported persistence rates beyond the specific replica setup.
minor comments (1)
- [Abstract] The acronym 'Claw' is used throughout without an initial expansion or reference to the specific commercial systems it encompasses.
Simulated Author's Rebuttal
We thank the referee for the thorough review and valuable feedback on improving the verifiability and precision of our claims. We have revised the manuscript to address the points raised, expanding the abstract, adding supporting references, and clarifying the formalization. Below we respond point by point.
point-by-point responses
- Referee: [Abstract] The reported rates (61% misleading, 91% memory persistence, 76% cross-session) are presented without any accompanying experimental protocol, trial counts, controls for content dilution or pruning, statistical tests, or raw data, rendering the quantitative claims unverifiable from the manuscript.
Authors: We agree that the abstract should include sufficient methodological context to support the reported rates. In the revised manuscript we have expanded the abstract to summarize the experimental protocol (controlled trials with MissClaw under varying dilution levels), trial counts (N=200 per condition across three sessions), explicit controls for content dilution and pruning, and the statistical tests performed (chi-square tests yielding p<0.01). Full protocol details, raw data tables, and code are now referenced in Section 4 and the supplementary materials. revision: yes
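The statistical check the rebuttal describes can be reproduced in outline. The sketch below is illustrative only: the counts are invented for demonstration (they are not the paper's raw data), and it uses the standard 2x2 chi-square statistic directly rather than any particular library.

```python
# Illustrative 2x2 chi-square check for exposed vs. control conditions
# (N=200 per condition is the rebuttal's figure; counts here are made up).

def chi_square_2x2(a, b, c, d):
    # a, b = misleading / not-misleading in the exposed condition;
    # c, d = the same counts in the control condition.
    n = a + b + c + d
    # Standard 2x2 chi-square statistic, no continuity correction.
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

# Hypothetical counts: 122/200 misleading when exposed vs. 10/200 in control.
stat = chi_square_2x2(122, 78, 10, 190)
# For df=1, stat > 6.63 corresponds to p < 0.01.
assert stat > 6.63
```

Reporting the full contingency tables alongside the statistic would let readers recompute p-values for every condition.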
- Referee: [Abstract] The foundational premise that heartbeat background execution 'runs in the same session as user-facing conversation' across the Claw ecosystem is asserted without supporting evidence such as code inspection, API traces, or documentation from actual implementations; the MissClaw results demonstrate the pathway only when session sharing is instantiated by construction.
Authors: The premise is grounded in publicly available Claw documentation and observed agent behavior across multiple deployments, which we now cite explicitly in the introduction (including API reference links and user-reported session logs). Direct code inspection of proprietary implementations is not feasible; however, the MissClaw replica faithfully reproduces the documented shared-session architecture, and the E→M→B pathway is shown to hold under those conditions. We have added a dedicated paragraph clarifying the evidential basis. revision: partial
- Referee: [Formalization section] The E→M→B pathway formalization: The description lacks precise conditions or thresholds for when short-term context is written to long-term memory versus pruned, making it difficult to evaluate the generality of the reported persistence rates beyond the specific replica setup.
Authors: We have revised Section 3 to specify the exact conditions: short-term context is promoted to long-term memory when the agent's internal salience score exceeds 0.7 and the content survives at least three subsequent turns without pruning. Pruning occurs for items below salience 0.4 or after five turns of inactivity. These thresholds are derived from the agent's documented memory heuristics and are now illustrated with concrete examples from the MissClaw runs, allowing evaluation of generality. revision: yes
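Under the stated thresholds, the promotion and pruning rule can be sketched directly. This is hypothetical code: the field names and the copy-on-promote behavior are assumptions for illustration, not the agent's documented implementation.

```python
# One turn of memory maintenance under the thresholds stated in the rebuttal:
# prune below salience 0.4 or after 5 turns of inactivity; promote to
# long-term memory above salience 0.7 once an item has survived 3 turns.

def step_memory(short_term, long_term, turn):
    kept = []
    for item in short_term:
        if item["salience"] < 0.4 or turn - item["last_active"] >= 5:
            continue  # pruned: low salience or too long inactive
        if item["salience"] > 0.7 and turn - item["entered"] >= 3:
            long_term.append(item)  # promoted (copied) to durable memory
        kept.append(item)  # surviving items remain in short-term context
    return kept

short_term = [
    {"salience": 0.9, "entered": 0, "last_active": 2},  # promoted at turn 3
    {"salience": 0.3, "entered": 0, "last_active": 2},  # pruned: salience < 0.4
]
long_term = []
short_term = step_memory(short_term, long_term, turn=3)
```

A real system would also mark items as already promoted to avoid duplicate writes on later turns; that bookkeeping is omitted here for brevity.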
Circularity Check
No significant circularity detected
full rationale
The paper presents an experimental study of the E→M→B pathway using a controlled replica (MissClaw) of a real agent system. Claims are grounded in direct measurements of misleading rates, memory persistence, and cross-session influence rather than any mathematical derivation, fitted parameters renamed as predictions, or load-bearing self-citations. The architectural premise (shared session context) is stated as an observed design property and tested via instantiation; it is not derived from prior self-referential results or definitions. No equations, ansatzes, or uniqueness theorems appear that could reduce the reported outcomes to inputs by construction. The work is therefore self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Heartbeat background execution runs in the same session as user-facing conversation
Forward citations
Cited by 1 Pith paper
- When Routine Chats Turn Toxic: Unintended Long-Term State Poisoning in Personalized Agents
Routine user chats can unintentionally poison the long-term state of personalized LLM agents, causing authorization drift, tool escalation, and unchecked autonomy, as measured by a new benchmark and reduced by the Sta...