A Survey on the Security of Long-Term Memory in LLM Agents: Toward Mnemonic Sovereignty
Pith reviewed 2026-05-10 08:49 UTC · model grok-4.3
The pith
The survey maps security vulnerabilities in LLM agent memory across write-store-retrieve-execute-share-forget phases and advocates for mnemonic sovereignty to enable verifiable control over memory operations.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Three findings stand out: the literature concentrates on write- and retrieve-time integrity attacks, while confidentiality, availability, store/forget, and benign-persistence failures remain sparsely studied; no published architecture covers all nine governance primitives we identify; and using LLMs themselves for memory security remains sparse yet essential.
Load-bearing premise
The six-phase memory-lifecycle framework comprehensively captures all relevant security aspects of agent memory, and the identified literature gaps accurately reflect the state of the field, even though the paper gives no details of a systematic search methodology.
Original abstract
Research on large language model (LLM) security is shifting from "will the model leak training data" to a more consequential question: can an agent with persistent, long-term memory be continuously shaped, cross-session poisoned, accessed without authorization, and propagated across shared organizational state? Recent surveys cover memory architectures and agent mechanisms, but fewer center the epistemic and governance properties of persistent, writable memory as the reason memory is an independent security problem. This survey addresses that gap. Drawing on cognitive neuroscience and the philosophy of memory, we characterize agent memory as malleable, rewritable, and socially propagating, and develop a memory-lifecycle framework organized around six phases -- Write, Store, Retrieve, Execute, Share, Forget/Rollback -- cross-tabulated against four security objectives: integrity, confidentiality, availability, governance. We organize the literature on memory poisoning, extraction, retrieval corruption, control-flow hijacking, cross-agent propagation, rollback, and governance, and situate representative architectures as determinants of which phases are explicitly governable. Three findings stand out: the literature concentrates on write- and retrieve-time integrity attacks, while confidentiality, availability, store/forget, and benign-persistence failures remain sparsely studied; no published architecture covers all nine governance primitives we identify; and using LLMs themselves for memory security remains sparse yet essential. We unify these under mnemonic sovereignty -- verifiable, recoverable governance over what may be written, who may read, when updates are authorized, and which states may be forgotten -- arguing future secure agents will be differentiated not only by recall capacity, but by memory governance quality.
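The abstract defines mnemonic sovereignty as governance over what may be written, who may read, when updates are authorized, and which states may be forgotten. A minimal sketch of those four controls, plus rollback and an audit trail, might look like the following. Every name here (`GovernedMemory`, `writers`, `readers`, `forgettable`) is a hypothetical illustration and not an interface from the paper; a real system would also need authentication and tamper-evident logging.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Set, Tuple


@dataclass
class GovernedMemory:
    """Toy governed memory store. All names are hypothetical, not from the paper."""
    writers: Set[str]                    # principals allowed to write, forget, roll back
    readers: Set[str]                    # principals allowed to read
    forgettable: Callable[[str], bool]   # policy: which keys may be forgotten
    _state: Dict[str, str] = field(default_factory=dict)
    _history: List[Dict[str, str]] = field(default_factory=list)  # snapshots for rollback
    audit: List[Tuple[str, str, str, bool]] = field(default_factory=list)

    def _log(self, op: str, who: str, key: str, ok: bool) -> bool:
        # Record every attempt, allowed or denied, so decisions can be checked later.
        self.audit.append((op, who, key, ok))
        return ok

    def write(self, who: str, key: str, value: str) -> bool:
        if who not in self.writers:
            return self._log("write", who, key, False)
        self._history.append(dict(self._state))  # snapshot before mutating
        self._state[key] = value
        return self._log("write", who, key, True)

    def read(self, who: str, key: str) -> Optional[str]:
        ok = who in self.readers and key in self._state
        self._log("read", who, key, ok)
        return self._state[key] if ok else None

    def forget(self, who: str, key: str) -> bool:
        ok = who in self.writers and key in self._state and self.forgettable(key)
        if ok:
            self._history.append(dict(self._state))
            del self._state[key]
        return self._log("forget", who, key, ok)

    def rollback(self, who: str) -> bool:
        # Restore the snapshot taken before the most recent successful mutation.
        ok = who in self.writers and bool(self._history)
        if ok:
            self._state = self._history.pop()
        return self._log("rollback", who, "*", ok)


mem = GovernedMemory(
    writers={"agent"},
    readers={"agent", "auditor"},
    forgettable=lambda key: not key.startswith("legal/"),
)
mem.write("agent", "user/pref", "dark mode")
mem.write("web_page", "user/pref", "injected text")  # denied: not an authorized writer
print(mem.read("auditor", "user/pref"))
print(mem.audit)
```

The denied write from `web_page` leaves the stored value untouched but still appears in `audit`, which is the sense in which the governance is meant to be verifiable in this sketch.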
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper surveys security issues in long-term memory for LLM agents. It draws on cognitive neuroscience and philosophy of memory to characterize agent memory as malleable and socially propagating, proposes a six-phase lifecycle framework (Write, Store, Retrieve, Execute, Share, Forget/Rollback) cross-tabulated against four security objectives (integrity, confidentiality, availability, governance), reviews literature on attacks such as poisoning, extraction, and propagation as well as representative architectures, identifies concentrations and gaps in the literature, and introduces the concept of mnemonic sovereignty as verifiable governance over memory operations.
Significance. If the framework is comprehensive and the literature categorization representative, the survey could usefully organize research on an emerging security surface for persistent LLM agents and highlight under-studied areas such as confidentiality and governance primitives, potentially informing design of more secure agent systems differentiated by memory control quality.
major comments (2)
- [Literature review and findings sections] The central findings that the literature concentrates on write- and retrieve-time integrity attacks while confidentiality, availability, store/forget, and benign-persistence failures remain sparsely studied, and that no published architecture covers all nine governance primitives, rest on the authors' literature categorization. The manuscript supplies no search protocol, databases, keywords, inclusion/exclusion criteria, or temporal bounds, so these gap claims cannot be distinguished from under-sampling by the review.
- [Memory-lifecycle framework definition] The six-phase memory-lifecycle framework plus four objectives is used to organize the literature and to support the claim that no architecture covers all nine governance primitives. No derivation, coverage argument, or justification is provided for why these phases and objectives exhaust the relevant security surface; missing dimensions such as provenance, multi-agent consensus, or memory versioning would falsify the coverage claim.
minor comments (1)
- [Abstract] The abstract refers to 'nine governance primitives' without clarifying how this number is obtained from the 6x4 framework; a brief derivation or table mapping would improve clarity.
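To make the comments above concrete, the phase-by-objective grid can be written out and used to tally where categorized works fall. The sketch below is illustrative only: the phase and objective names come from the abstract, but the `tally` helper and the `toy` assignments are hypothetical placeholders, not the paper's categorization or its data.

```python
from collections import Counter
from itertools import product

PHASES = ["Write", "Store", "Retrieve", "Execute", "Share", "Forget/Rollback"]
OBJECTIVES = ["integrity", "confidentiality", "availability", "governance"]

# Every (phase, objective) cell of the framework.
GRID = list(product(PHASES, OBJECTIVES))


def tally(assignments):
    """Count works per grid cell and list the cells no work touches."""
    counts = Counter()
    for work, cells in assignments.items():
        for cell in cells:
            if cell not in GRID:
                raise ValueError(f"{work}: {cell} is not a cell of the grid")
            counts[cell] += 1
    empty = [cell for cell in GRID if counts[cell] == 0]
    return counts, empty


# Placeholder labels only; these do not correspond to real papers.
toy = {
    "work_a": [("Write", "integrity")],
    "work_b": [("Write", "integrity"), ("Retrieve", "integrity")],
    "work_c": [("Share", "confidentiality")],
}

counts, empty = tally(toy)
print(len(GRID))                       # 24 cells
print(counts[("Write", "integrity")])  # 2
print(len(empty))                      # 21 cells with no assigned work
```

The grid has 24 cells, and neither 6, 4, nor 24 yields nine by simple counting, which is why the minor comment asks for an explicit mapping. A tally of this kind also shows that an empty cell reflects only the works that were sampled, which is the substance of the first major comment.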
Simulated Author's Rebuttal
We thank the referee for their constructive and detailed review. The comments identify two areas where the manuscript can be strengthened through greater methodological transparency and explicit justification of the proposed framework. We address each point below and will incorporate the suggested revisions in the next version of the manuscript.
- Referee: [Literature review and findings sections] The central findings that the literature concentrates on write- and retrieve-time integrity attacks while confidentiality, availability, store/forget, and benign-persistence failures remain sparsely studied, and that no published architecture covers all nine governance primitives, rest on the authors' literature categorization. The manuscript supplies no search protocol, databases, keywords, inclusion/exclusion criteria, or temporal bounds, so these gap claims cannot be distinguished from under-sampling by the review.
  Authors: We agree that the absence of an explicit search protocol limits the defensibility of the gap claims. In the revised manuscript we will insert a dedicated 'Literature Review Methodology' subsection. It will specify the databases queried (arXiv, Google Scholar, ACL Anthology, IEEE Xplore, and selected security conference proceedings), the keyword strings employed (including 'LLM agent long-term memory', 'memory poisoning', 'retrieval attack LLM agent', 'agent memory governance'), the inclusion criteria (works addressing persistent, cross-session memory in LLM-based agents, published or posted 2022–2024), and the exclusion criteria (short-context-only studies, non-agent systems, purely theoretical model papers without implementation). Temporal bounds will be justified by the emergence of production-grade agent frameworks after 2022. These additions will allow readers to evaluate sampling completeness and will reinforce rather than undermine the reported concentrations and gaps.
  revision: yes
- Referee: [Memory-lifecycle framework definition] The six-phase memory-lifecycle framework plus four objectives is used to organize the literature and to support the claim that no architecture covers all nine governance primitives. No derivation, coverage argument, or justification is provided for why these phases and objectives exhaust the relevant security surface; missing dimensions such as provenance, multi-agent consensus, or memory versioning would falsify the coverage claim.
  Authors: The six phases are adapted from canonical cognitive-neuroscience accounts of memory (encoding, storage, retrieval, execution, social transmission, and forgetting/rollback), while the four objectives extend the CIA triad with governance to capture control and accountability requirements specific to writable agent memory. We acknowledge that the current text provides no explicit derivation or completeness argument. In revision we will expand the framework section with a new subsection that (a) maps each phase to its cognitive and agentic counterpart, (b) justifies the four objectives as the minimal set needed to cover integrity, secrecy, liveness, and authorization, and (c) addresses the cited dimensions: provenance is subsumed under write-time integrity and governance primitives; multi-agent consensus is handled within the Share phase and the governance objective; versioning is treated as part of Store and Forget/Rollback. We will also note any residual gaps and, if warranted, augment the nine primitives rather than assert exhaustiveness without support. This revision will preserve the framework while making its coverage claims verifiable.
  revision: partial
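The inclusion and exclusion criteria proposed in the first response could be expressed as an explicit screening step, which is what would let readers audit sampling completeness. The sketch below assumes a hypothetical `Record` type with hand-coded boolean fields; the keyword strings and the 2022 to 2024 bounds are taken from the response, while everything else is an illustrative assumption rather than the authors' actual protocol.

```python
from dataclasses import dataclass
from typing import Tuple

# Keyword strings and temporal bounds as stated in the authors' response.
KEYWORDS = [
    "LLM agent long-term memory",
    "memory poisoning",
    "retrieval attack LLM agent",
    "agent memory governance",
]
YEAR_RANGE = (2022, 2024)


@dataclass(frozen=True)
class Record:
    """Hypothetical candidate work; the flags would be coded by a human screener."""
    title: str
    year: int
    persistent_memory: bool   # addresses persistent, cross-session memory
    agent_based: bool         # concerns an LLM-based agent
    has_implementation: bool  # not a purely theoretical model paper


def screen(record: Record) -> Tuple[bool, str]:
    """Return (included, reason) so every exclusion is recorded with its cause."""
    first, last = YEAR_RANGE
    if not first <= record.year <= last:
        return False, "outside temporal bounds"
    if not record.persistent_memory:
        return False, "short-context-only study"
    if not record.agent_based:
        return False, "non-agent system"
    if not record.has_implementation:
        return False, "purely theoretical, no implementation"
    return True, "included"


# Placeholder records, not real papers.
candidates = [
    Record("placeholder A", 2023, True, True, True),
    Record("placeholder B", 2021, True, True, True),
    Record("placeholder C", 2024, False, True, True),
]
for record in candidates:
    print(record.title, screen(record))
```

Returning a reason alongside the decision is a deliberate choice in this sketch: a log of exclusion reasons is what allows a reader to separate a genuine gap in the literature from an artifact of the filter.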
Circularity Check
No circularity: survey framework and gap analysis are externally grounded
This literature survey proposes a six-phase memory-lifecycle framework (Write, Store, Retrieve, Execute, Share, Forget/Rollback) cross-tabulated with four security objectives (integrity, confidentiality, availability, governance) drawn from cognitive neuroscience and philosophy of memory. The claim that no architecture covers all nine governance primitives follows directly from applying this externally motivated taxonomy to reviewed works; it does not reduce to a self-definition, fitted parameter, or self-citation chain. Gap assertions about sparsely studied areas likewise rest on the literature organization rather than any internal prediction or uniqueness theorem imported from the authors' prior work. No equations, derivations, or statistical fits appear, satisfying the default expectation of no significant circularity for a non-mathematical survey.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Agent memory is malleable, rewritable, and socially propagating.
invented entities (1)
- mnemonic sovereignty: no independent evidence
Forward citations
Cited by 2 Pith papers
- A Systematic Survey of Security Threats and Defenses in LLM-Based AI Agents: A Layered Attack Surface Framework
  A new 7x4 taxonomy organizes agentic AI security threats by architectural layer and persistence timescale, revealing under-explored upper layers and missing defenses after surveying 116 papers.
- Security Attack and Defense Strategies for Autonomous Agent Frameworks: A Layered Review with OpenClaw as a Case Study
  The survey organizes security threats and defenses in autonomous LLM agents into four layers and identifies that risks can propagate across layers from inputs to ecosystem impacts.