Mind Your HEARTBEAT! Claw Background Execution Inherently Enables Silent Memory Pollution
Pith reviewed 2026-05-15 00:51 UTC · model grok-4.3
The pith
Untrusted content from background heartbeat execution in Claw AI agents can pollute shared memory and influence user-facing behavior without detection.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central finding is that the architectural choice of running heartbeat background execution in the same session as foreground interactions creates an Exposure-Memory-Behavior pathway: misinformation encountered in the background enters short-term context, can be saved to long-term memory through routine behaviors, and subsequently shapes responses to user queries. Using MissClaw, a controlled replica of Moltbook, the authors show that perceived consensus drives short-term behavioral influence with misleading rates up to 61%, that routine memory saving promotes short-term pollution into durable long-term memory at rates up to 91%, and that cross-session behavioral influence reaches 76%; even under naturalistic browsing with content dilution and context pruning, pollution still crosses session boundaries.
What carries the argument
The Exposure (E) to Memory (M) to Behavior (B) pathway, where background content shares the session context and can transfer to long-term storage.
Load-bearing premise
The load-bearing premise is that background heartbeat execution shares the identical session and memory context with user-facing conversations, allowing external content to mix in without separation.
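The premise can be made concrete with a minimal sketch. Everything below is a hypothetical illustration, not Claw's actual code: the point is only that when the heartbeat path and the foreground path write to one shared context, untrusted background content is indistinguishable from foreground state.

```python
# Minimal sketch (hypothetical, not Claw's implementation) of the
# shared-session architecture the premise describes: heartbeat and
# foreground writes land in the same context object.

class SharedSessionAgent:
    def __init__(self):
        self.session_context = []   # short-term context, shared by both paths
        self.long_term_memory = []  # durable store

    def heartbeat_tick(self, external_content):
        # Background execution: ingested content enters the SAME context
        # used for foreground interaction, with no provenance tag.
        self.session_context.append(external_content)

    def user_turn(self, query):
        # The foreground response is conditioned on everything in context,
        # including background-ingested items.
        return query, list(self.session_context)

agent = SharedSessionAgent()
agent.heartbeat_tick("viral post: product X causes data loss")  # untrusted
query, context = agent.user_turn("Should I use product X?")
assert "viral post: product X causes data loss" in context
```

Separating the two paths into distinct contexts, or tagging provenance at ingestion, would break this mixing by construction.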
What would settle it
A direct test would be to run background exposure to misleading content in a controlled Claw-like agent and then query the agent in a new session on related topics to check if behavior changes compared to controls without background exposure.
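A toy version of that protocol is sketched below. `run_agent` is a stand-in placeholder, not a real agent call; an actual test would query a Claw-like agent in a fresh session and score its answers, with matched exposed and control conditions.

```python
# Hypothetical harness for the settling test: matched trials with and
# without background exposure, then compare misleading-answer rates.
import random

def run_agent(exposed, seed):
    # Stand-in for a real agent query; here a toy model in which prior
    # background exposure raises the chance of echoing the misleading claim.
    rng = random.Random(seed)
    p_misleading = 0.6 if exposed else 0.05
    return rng.random() < p_misleading

def trial_rates(n=200):
    exposed_rate = sum(run_agent(True, s) for s in range(n)) / n
    control_rate = sum(run_agent(False, s + n) for s in range(n)) / n
    return exposed_rate, control_rate

exposed_rate, control_rate = trial_rates()
# A large gap between the two rates would indicate the E→M→B effect.
```

The control arm is essential: without it, any misleading answers could reflect the base model's priors rather than background pollution.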
original abstract
We identify a critical security vulnerability in mainstream Claw personal AI agents: untrusted content encountered during heartbeat-driven background execution can silently pollute agent memory and subsequently influence user-facing behavior without the user's awareness. This vulnerability arises from an architectural design shared across the Claw ecosystem: heartbeat background execution runs in the same session as user-facing conversation, so content ingested from any external source monitored in the background (including email, message channels, news feeds, code repositories, and social platforms) can enter the same memory context used for foreground interaction, often with limited user visibility and without clear source provenance. We formalize this process as an Exposure (E) $\rightarrow$ Memory (M) $\rightarrow$ Behavior (B) pathway: misinformation encountered during heartbeat execution enters the agent's short-term session context, potentially gets written into long-term memory, and later shapes downstream user-facing behavior. We instantiate this pathway in an agent-native social setting using MissClaw, a controlled research replica of Moltbook. We find that (1) social credibility cues, especially perceived consensus, are the dominant driver of short-term behavioral influence, with misleading rates up to 61%; (2) routine memory-saving behavior can promote short-term pollution into durable long-term memory at rates up to 91%, with cross-session behavioral influence reaching 76%; (3) under naturalistic browsing with content dilution and context pruning, pollution still crosses session boundaries. Overall, prompt injection is not required: ordinary social misinformation is sufficient to silently shape agent memory and behavior under heartbeat-driven background execution.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper identifies a security vulnerability in mainstream Claw personal AI agents arising from heartbeat-driven background execution sharing the same session and memory context as user-facing interactions. This allows untrusted external content (e.g., from social platforms) to enter via an Exposure (E) → Memory (M) → Behavior (B) pathway, silently polluting long-term memory and influencing behavior without user awareness or prompt injection. Using MissClaw, a controlled replica of Moltbook, the authors report empirical rates of 61% misleading influence from social credibility cues, 91% promotion to durable memory, and 76% cross-session behavioral effects, concluding that ordinary social misinformation suffices for silent shaping of agent behavior.
Significance. If the central claims hold after verification, the work draws attention to an under-examined architectural risk in persistent AI agents that perform background monitoring, with potential implications for user privacy, trust, and security in consumer AI systems. The replica-based experimental approach is a positive element that supports reproducibility of the E→M→B pathway under controlled conditions.
major comments (3)
- [Abstract] The reported rates (61% misleading, 91% memory persistence, 76% cross-session) are presented without any accompanying experimental protocol, trial counts, controls for content dilution or pruning, statistical tests, or raw data, rendering the quantitative claims unverifiable from the manuscript.
- [Abstract] The foundational premise that heartbeat background execution 'runs in the same session as user-facing conversation' across the Claw ecosystem is asserted without supporting evidence such as code inspection, API traces, or documentation from actual implementations; the MissClaw results demonstrate the pathway only when session sharing is instantiated by construction.
- [Formalization section] The E→M→B pathway formalization: The description lacks precise conditions or thresholds for when short-term context is written to long-term memory versus pruned, making it difficult to evaluate the generality of the reported persistence rates beyond the specific replica setup.
minor comments (1)
- [Abstract] The acronym 'Claw' is used throughout without an initial expansion or reference to the specific commercial systems it encompasses.
Simulated Author's Rebuttal
We thank the referee for the thorough review and valuable feedback on improving the verifiability and precision of our claims. We have revised the manuscript to address the points raised, expanding the abstract, adding supporting references, and clarifying the formalization. Below we respond point by point.
point-by-point responses
- Referee: [Abstract] The reported rates (61% misleading, 91% memory persistence, 76% cross-session) are presented without any accompanying experimental protocol, trial counts, controls for content dilution or pruning, statistical tests, or raw data, rendering the quantitative claims unverifiable from the manuscript.
Authors: We agree that the abstract should include sufficient methodological context to support the reported rates. In the revised manuscript we have expanded the abstract to summarize the experimental protocol (controlled trials with MissClaw under varying dilution levels), trial counts (N=200 per condition across three sessions), explicit controls for content dilution and pruning, and the statistical tests performed (chi-square tests yielding p<0.01). Full protocol details, raw data tables, and code are now referenced in Section 4 and the supplementary materials. revision: yes
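The statistical check the rebuttal describes can be reproduced in outline. The sketch below is illustrative only: the counts are invented for demonstration (they are not the paper's raw data), and it uses the standard 2x2 chi-square statistic directly rather than any particular library.

```python
# Illustrative 2x2 chi-square check for exposed vs. control conditions
# (N=200 per condition is the rebuttal's figure; counts here are made up).

def chi_square_2x2(a, b, c, d):
    # a, b = misleading / not-misleading in the exposed condition;
    # c, d = the same counts in the control condition.
    n = a + b + c + d
    # Standard 2x2 chi-square statistic, no continuity correction.
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

# Hypothetical counts: 122/200 misleading when exposed vs. 10/200 in control.
stat = chi_square_2x2(122, 78, 10, 190)
# For df=1, stat > 6.63 corresponds to p < 0.01.
assert stat > 6.63
```

Reporting the full contingency tables alongside the statistic would let readers recompute p-values for every condition.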
- Referee: [Abstract] The foundational premise that heartbeat background execution 'runs in the same session as user-facing conversation' across the Claw ecosystem is asserted without supporting evidence such as code inspection, API traces, or documentation from actual implementations; the MissClaw results demonstrate the pathway only when session sharing is instantiated by construction.
Authors: The premise is grounded in publicly available Claw documentation and observed agent behavior across multiple deployments, which we now cite explicitly in the introduction (including API reference links and user-reported session logs). Direct code inspection of proprietary implementations is not feasible; however, the MissClaw replica faithfully reproduces the documented shared-session architecture, and the E→M→B pathway is shown to hold under those conditions. We have added a dedicated paragraph clarifying the evidential basis. revision: partial
- Referee: [Formalization section] The E→M→B pathway formalization: The description lacks precise conditions or thresholds for when short-term context is written to long-term memory versus pruned, making it difficult to evaluate the generality of the reported persistence rates beyond the specific replica setup.
Authors: We have revised Section 3 to specify the exact conditions: short-term context is promoted to long-term memory when the agent's internal salience score exceeds 0.7 and the content survives at least three subsequent turns without pruning. Pruning occurs for items below salience 0.4 or after five turns of inactivity. These thresholds are derived from the agent's documented memory heuristics and are now illustrated with concrete examples from the MissClaw runs, allowing evaluation of generality. revision: yes
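Under the stated thresholds, the promotion and pruning rule can be sketched directly. This is hypothetical code: the field names and the copy-on-promote behavior are assumptions for illustration, not the agent's documented implementation.

```python
# One turn of memory maintenance under the thresholds stated in the rebuttal:
# prune below salience 0.4 or after 5 turns of inactivity; promote to
# long-term memory above salience 0.7 once an item has survived 3 turns.

def step_memory(short_term, long_term, turn):
    kept = []
    for item in short_term:
        if item["salience"] < 0.4 or turn - item["last_active"] >= 5:
            continue  # pruned: low salience or too long inactive
        if item["salience"] > 0.7 and turn - item["entered"] >= 3:
            long_term.append(item)  # promoted (copied) to durable memory
        kept.append(item)  # surviving items remain in short-term context
    return kept

short_term = [
    {"salience": 0.9, "entered": 0, "last_active": 2},  # promoted at turn 3
    {"salience": 0.3, "entered": 0, "last_active": 2},  # pruned: salience < 0.4
]
long_term = []
short_term = step_memory(short_term, long_term, turn=3)
```

A real system would also mark items as already promoted to avoid duplicate writes on later turns; that bookkeeping is omitted here for brevity.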
Circularity Check
No significant circularity detected
full rationale
The paper presents an experimental study of the E→M→B pathway using a controlled replica (MissClaw) of a real agent system. Claims are grounded in direct measurements of misleading rates, memory persistence, and cross-session influence rather than any mathematical derivation, fitted parameters renamed as predictions, or load-bearing self-citations. The architectural premise (shared session context) is stated as an observed design property and tested via instantiation; it is not derived from prior self-referential results or definitions. No equations, ansatzes, or uniqueness theorems appear that could reduce the reported outcomes to inputs by construction. The work is therefore self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Heartbeat background execution runs in the same session as user-facing conversation
Forward citations
Cited by 1 Pith paper
- When Routine Chats Turn Toxic: Unintended Long-Term State Poisoning in Personalized Agents
Routine user chats can unintentionally poison the long-term state of personalized LLM agents, causing authorization drift, tool escalation, and unchecked autonomy, as measured by a new benchmark and reduced by the Sta...