Hijacking Agent Memory: Stealthy Trojan Attacks Through Conversational Interaction

Hongtao Wang; Puzhuo Liu; Se Yang; Yu Chen

arxiv: 2605.29960 · v1 · pith:4JC3S4ZKnew · submitted 2026-05-28 · 💻 cs.CR · cs.AI

Hijacking Agent Memory: Stealthy Trojan Attacks Through Conversational Interaction

Hongtao Wang , Se Yang , Yu Chen , Puzhuo Liu This is my paper

Pith reviewed 2026-06-29 07:03 UTC · model grok-4.3

classification 💻 cs.CR cs.AI

keywords memory poisoningLLM agentstrojan attacksbackdoor injectionconversational interactionselective memoryembedding optimizationagent security

0 comments

The pith

MemPoison bypasses selective memory extraction and rewriting in LLM agents by injecting triggerable backdoors through dialogue.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that LLM agents with long-term memory can have malicious information stored despite built-in selection and rewriting steps. An attacker uses normal conversation to plant a trigger and payload that stay together in memory and later alter the agent's behavior. Three techniques make this possible: linking the trigger and payload in one coherent statement, making the trigger look like a named entity, and clustering the malicious text in embedding space away from normal content. If the claim holds, it means conversational access alone is enough to compromise persistent agent memory without any direct write permission.

Core claim

MemPoison achieves attack success rates up to 0.95 by combining a semantic relational bridge that binds trigger and payload into coherent statements, entity masquerading that optimizes triggers to resemble named entities, and joint embedding optimization that forms tight isolated clusters in embedding space. These steps allow injected content to survive selective extraction and rewriting stages that defeated prior attacks assuming direct memory writes.

What carries the argument

The three-component attack pipeline of semantic relational bridge, entity masquerading, and joint embedding optimization that forces trigger-payload pairs to be stored together while evading memory filters.

If this is right

Attack success reaches up to 0.95 across multiple agent domains and memory mechanisms.
The method outperforms prior baselines that required direct memory injection.
The attack exploits embedding-space anisotropy and altered attention patterns.
Evaluated defense strategies show fundamental limitations against this form of poisoning.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Agent memory systems may require additional verification steps that check relational consistency between stored items.
Embedding isolation checks could be added at memory write time to flag anomalous clusters.
Similar dialogue-based injection risks may exist in any AI system that maintains editable long-term state from user input.

Load-bearing premise

That binding triggers and payloads into coherent statements while shaping their embeddings into isolated clusters will let them survive the selective extraction and rewriting stages of modern memory pipelines.

What would settle it

An experiment that disables joint embedding optimization, keeps the other two components, and measures whether attack success rate drops below 0.5 on the same agent setups.

Figures

Figures reproduced from arXiv: 2605.29960 by Hongtao Wang, Puzhuo Liu, Se Yang, Yu Chen.

**Figure 1.** Figure 1: Overview of memory poisoning attack in selective LLM agent memory systems. An attacker injects a trigger–payload [PITH_FULL_IMAGE:figures/full_fig_p002_1.png] view at source ↗

**Figure 2.** Figure 2: Attack Scenario of the proposed memory poisoning attack in a shared agent platform. The attack consists of two [PITH_FULL_IMAGE:figures/full_fig_p003_2.png] view at source ↗

**Figure 3.** Figure 3: Overview of the MemPoison framework. injection [17] shows that malicious instructions can be embedded in such content and later ingested through user interactions (e.g., copy-pasting text or requesting summaries). Similarly, triggers can be embedded in such external content. When users copy text from these sources or request the agent to summarize them, the embedded triggers can be carried verbatim into th… view at source ↗

**Figure 11.** Figure 11: Intuitively, a benign query without the trigger will map [PITH_FULL_IMAGE:figures/full_fig_p005_11.png] view at source ↗

**Figure 5.** Figure 5: Impact of trigger entity type. 1000 3000 5000 7000 Number of Benign Records 0.2 0.4 0.6 0.8 1.0 RSR@1 Random MINJA AgentPoison MemPoison 1000 3000 5000 7000 Number of Benign Records 0.4 0.5 0.6 0.7 0.8 0.9 1.0 ACC Random MINJA AgentPoison MemPoison [PITH_FULL_IMAGE:figures/full_fig_p008_5.png] view at source ↗

**Figure 6.** Figure 6: Impact of the number of benign records. As the length increases to 3 tokens, RSR@1 and ASR rise sharply, peaking at 0.98 and 0.95, respectively, indicating that the trigger becomes sufficiently distinctive to dominate retrieval. However, further increasing the trigger length (|𝝉 | ∈ {4, 5}) leads to a drop in ISR, RSR@1, and ASR. By analyzing failure cases, we find that longer trigger sequences are more fr… view at source ↗

**Figure 8.** Figure 8: Impact of the retrieval window size 𝑘 (with 𝑁 = 5 injected malicious records). MiniLM E5 GTR-T5 aMPNet Arctic Contriever BGE-Small ANCE Target Embedder MiniLM E5 GTR-T5 aMPNet Arctic Contriever BGE-Small ANCE Source Embedder 1.00 0.95 0.97 0.58 0.45 0.61 0.48 0.87 0.51 0.96 0.62 0.13 0.61 0.53 0.27 0.77 0.83 0.83 0.99 0.25 0.57 0.28 0.34 0.73 0.42 0.61 0.80 0.78 0.61 0.73 0.62 0.51 0.48 0.67 0.82 0.32 0.78… view at source ↗

**Figure 9.** Figure 9: Transferability of the trigger across embedding models. The heatmaps show the RSR, ASR, and ACC metrics when [PITH_FULL_IMAGE:figures/full_fig_p009_9.png] view at source ↗

**Figure 10.** Figure 10: Self-attention heatmaps of (a) the benign text and (b) the trigger-injected text (trigger tokens highlighted in boxes), [PITH_FULL_IMAGE:figures/full_fig_p011_10.png] view at source ↗

**Figure 11.** Figure 11: (a) PCA projection of MiniLM embeddings for be [PITH_FULL_IMAGE:figures/full_fig_p011_11.png] view at source ↗

**Figure 12.** Figure 12: Geometric vulnerability analysis across embed [PITH_FULL_IMAGE:figures/full_fig_p012_12.png] view at source ↗

**Figure 13.** Figure 13: Prompt used for paraphrasing. Tokenization and linguistic annotation. We use spaCy to tokenize both the original and rewritten sentences and to obtain part-of-speech (POS) tags and named entity recognition (NER) predictions. We remove punctuation and whitespace tokens before computing token-level statistics. Entity spans are identified from the original sentence. Token categories. We report verbatim ret… view at source ↗

**Figure 14.** Figure 14: Pilot study on token preservation under LLM para [PITH_FULL_IMAGE:figures/full_fig_p015_14.png] view at source ↗

**Figure 15.** Figure 15: Prompt used to construct the semantic relational [PITH_FULL_IMAGE:figures/full_fig_p015_15.png] view at source ↗

**Figure 16.** Figure 16: The prompt used for answer generation in our [PITH_FULL_IMAGE:figures/full_fig_p018_16.png] view at source ↗

**Figure 19.** Figure 19: Prompt used for the paraphrasing-based defense. [PITH_FULL_IMAGE:figures/full_fig_p019_19.png] view at source ↗

**Figure 18.** Figure 18: Prompt used for ASR evaluation. records 𝑁poison from 1 to 4 with 8,000 benign records; [PITH_FULL_IMAGE:figures/full_fig_p019_18.png] view at source ↗

read the original abstract

Large language model (LLM) agents increasingly leverage long term memory to support persistent and autonomous task execution. However, this capability also introduces a new attack surface: memory poisoning, where adversaries can inject malicious information to influence future behavior. Existing memory poisoning attacks often assume that injected content can be stored directly in memory, overlooking the selective extraction and rewriting stages in modern memory pipelines. This makes prior methods ineffective under realistic settings. In this paper, we propose MemPoison, a novel memory poisoning attack that bypasses selective memory mechanisms in LLM agents, where an attacker can inject triggerable backdoors into the agent's long-term memory through dialogue interactions, thereby misleading its subsequent responses. MemPoison introduces three key components: (i) a semantic relational bridge that binds the trigger and payload into a coherent statement to ensure they are extracted into memory together; (ii) entity masquerading that optimizes triggers to mimic named entities, resisting rewriting; and (iii) joint embedding optimization that shapes trigger-injected texts into a tight cluster in the embedding space while maintaining isolation from benign embeddings for stealth. Evaluations across different agent domains and memory mechanisms show MemPoison achieves attack success rates up to 0.95, outperforming existing baselines. Mechanistic analysis indicates that the attack exploits embedding-space anisotropy and shifts attention patterns, highlighting core vulnerabilities in selective memory systems. We evaluate multiple defense strategies and demonstrate their fundamental limitations in mitigating the attack.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

MemPoison shows a workable way to plant backdoors in LLM agent memory through ordinary chat by targeting the extraction and rewriting filters that prior attacks skipped.

read the letter

The main thing to know is that this paper builds an attack called MemPoison that injects triggerable backdoors into persistent LLM agent memory via dialogue, rather than assuming direct writes to memory. It focuses on bypassing the selective extraction and rewriting stages that real memory pipelines use.

What is new is the three-part construction: a semantic relational bridge that links trigger and payload in one statement so they get stored together, entity masquerading that tunes triggers to resemble normal named entities and avoid rewriting, and joint embedding optimization that clusters the poisoned texts tightly while keeping them separate from benign ones. The abstract reports attack success rates reaching 0.95 across domains and memory types, with gains over baselines, plus some mechanistic notes on embedding anisotropy and attention shifts. They also evaluate several defenses and note their limits.

The evaluation scope and the explicit contrast with earlier memory poisoning work are useful. The framing around realistic constraints on injection is a clear step past simpler threat models.

The soft spot is the lack of experimental detail in the abstract: no numbers on dataset sizes, exact baseline adaptations, or controls for how the memory mechanisms were implemented. That makes it hard to judge whether the reported gains are robust or depend on specific choices in the test setup. The mechanistic claims will need the full paper's evidence to hold up.

This is for people working on LLM agent security and memory design. Readers who build or audit autonomous agents with long-term memory will find the attack surface relevant. The work has a coherent method and empirical angle that deserves a serious referee, even if revisions will be needed on the evaluation details.

I would send it to peer review.

Referee Report

0 major / 2 minor

Summary. The paper proposes MemPoison, a memory poisoning attack on LLM agents that injects triggerable backdoors into long-term memory through dialogue interactions. It targets the selective extraction and rewriting stages of modern memory pipelines (overlooked by prior direct-injection attacks) via three components: a semantic relational bridge binding trigger and payload into coherent statements, entity masquerading to optimize triggers as named entities, and joint embedding optimization to form tight trigger clusters isolated from benign embeddings. Evaluations across agent domains and memory mechanisms report attack success rates up to 0.95, outperforming baselines; mechanistic analysis links success to embedding anisotropy and attention shifts; multiple defenses are evaluated and shown to have fundamental limitations.

Significance. If the reported results and mechanistic claims hold under rigorous controls, the work is significant for identifying a practical, dialogue-based attack vector on persistent LLM agent memory that prior methods cannot achieve. The explicit targeting of extraction/rewriting stages, combined with embedding-space analysis and defense evaluation, highlights core vulnerabilities in selective memory systems and provides a concrete basis for future hardening research.

minor comments (2)

The abstract and introduction would benefit from explicit citation of the specific memory pipeline stages (e.g., extraction, rewriting) in the evaluated agent frameworks to clarify the threat model.
Figure captions and axis labels in the embedding visualization and attention-shift plots should include the exact embedding model and layer indices used for the reported anisotropy measurements.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive summary of our work on MemPoison and for recognizing its significance in highlighting vulnerabilities in selective memory pipelines for LLM agents. We appreciate the recommendation for minor revision.

Circularity Check

0 steps flagged

No significant circularity in empirical attack construction

full rationale

The paper describes an empirical attack (MemPoison) via three components: semantic relational bridge, entity masquerading, and joint embedding optimization. These are presented as novel constructions evaluated on ASR against baselines (up to 0.95). No equations, parameter-fitting steps renamed as predictions, self-citation load-bearing premises, or uniqueness theorems appear in the provided text. The central claim is a measured empirical outcome on agent memory pipelines, self-contained against external baselines without reducing to its own inputs by definition.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 3 invented entities

The central claim rests on the effectiveness of three newly introduced attack components and the domain assumption that memory pipelines use selective embedding-based extraction vulnerable to clustering and entity mimicry.

axioms (1)

domain assumption LLM agents employ long-term memory pipelines with selective extraction and rewriting stages that prior direct-injection attacks fail to bypass.
Invoked in the abstract to explain why existing methods are ineffective under realistic settings.

invented entities (3)

semantic relational bridge no independent evidence
purpose: Binds trigger and payload into a coherent statement for joint extraction into memory.
New component introduced to ensure trigger and payload are stored together.
entity masquerading no independent evidence
purpose: Optimizes triggers to mimic named entities and resist rewriting.
New component introduced to evade memory rewriting mechanisms.
joint embedding optimization no independent evidence
purpose: Shapes trigger-injected texts into tight clusters in embedding space while isolated from benign embeddings.
New component introduced for stealth and reliable retrieval.

pith-pipeline@v0.9.1-grok · 5786 in / 1342 out tokens · 43986 ms · 2026-06-29T07:03:35.901334+00:00 · methodology

discussion (0)

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Securing LLM-Agent Long-Term Memory Against Poisoning: Non-Malleable, Origin-Bound Authority with Machine-Checked Guarantees
cs.CR 2026-06 unverdicted novelty 7.0 partial

Presents TMA-NM, a non-malleable origin-bound authority system for LLM-agent memory with TLA+ machine-checked separation theorems and benchmarks showing 0% attack success against direct and laundering poisoning while ...

Reference graph

Works this paper leans on

60 extracted references · 35 canonical work pages · cited by 1 Pith paper · 10 internal anchors

[1]

Gabriel Alon and Michael Kamfonas. 2023. Detecting Language Model Attacks with Perplexity. doi:10.48550/arXiv.2308.14132

work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.2308.14132 2023
[2]

Akari Asai, Zeqiu Wu, Yizhong Wang, Avirup Sil, and Hannaneh Hajishirzi
[3]

InThe Twelfth International Conference on Learning Representations

Self-RAG: Learning to Retrieve, Generate, and Critique through Self- Reflection. InThe Twelfth International Conference on Learning Representations. OpenReview.net
[4]

Matan Ben-Tov and Mahmood Sharif. 2025. GASLITEing the Retrieval: Exploring Vulnerabilities in Dense Embedding-Based Search. InProceedings of the 2025 ACM SIGSAC Conference on Computer and Communications Security. ACM, Taipei Taiwan, 4364–4378. doi:10.1145/3719027.3765095

work page doi:10.1145/3719027.3765095 2025
[5]

Xingyu Cai, Jiaji Huang, Yuchen Bian, and Kenneth Church. 2020. Isotropy in the Contextual Embedding Space: Clusters and Manifolds. InInternational Conference on Learning Representations. OpenReview.net

2020
[6]

Choquette-Choo, Milad Nasr, Cristina Nita-Rotaru, and Alina Oprea

Harsh Chaudhari, Giorgio Severi, John Abascal, Anshuman Suri, Matthew Jagiel- ski, Christopher A. Choquette-Choo, Milad Nasr, Cristina Nita-Rotaru, and Alina Oprea. 2026. Phantom: General Backdoor Attacks on Retrieval Augmented Language Generation.ACM Trans. AI Secur. Priv.(2026). doi:10.1145/3796729

work page doi:10.1145/3796729 2026
[7]

Zhiyu Chen, Wenhu Chen, Charese Smiley, Sameena Shah, Iana Borova, Dylan Langdon, Reema Moussa, Matt Beane, Ting-Hao Huang, Bryan Routledge, and William Yang Wang. 2021. FinQA: A Dataset of Numerical Reasoning over Financial Data. InProceedings of the 2021 Conference on Empirical Methods in Natural Language Processing. Association for Computational Lingui...

work page doi:10.18653/v1/2021.emnlp- 2021
[8]

Zhuo Chen, Yuyang Gong, Jiawei Liu, Miaokun Chen, Haotan Liu, Qikai Cheng, Fan Zhang, Wei Lu, and Xiaozhong Liu. 2025. FlippedRAG: Black-Box Opinion Manipulation Adversarial Attacks to Retrieval-Augmented Generation Models. In Hijacking Agent Memory: Stealthy Trojan Attacks Through Conversational Interaction Proceedings of the 2025 ACM SIGSAC Conference o...

work page doi:10.1145/3719027.3765023 2025
[9]

Zhaorun Chen, Zhen Xiang, Chaowei Xiao, Dawn Song, and Bo Li. 2024. Agent- Poison: Red-Teaming LLM Agents via Poisoning Memory or Knowledge Bases. InProceedings of the 38th International Conference on Neural Information Process- ing Systems, A. Globerson, L. Mackey, D. Belgrave, A. Fan, U. Paquet, J. Tom- czak, and C. Zhang (Eds.), Vol. 37. Curran Associa...

work page doi:10.52202/079017-4136 2024
[10]

Prateek Chhikara, Dev Khant, Saket Aryan, Taranjeet Singh, and Deshraj Yadav
[11]

Mem0: Building Production-Ready AI Agents with Scalable Long-Term Memory

Mem0: Building Production-Ready AI Agents with Scalable Long-Term Memory. doi:10.48550/arXiv.2504.19413

work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.2504.19413
[12]

Junjie Chu, Yugeng Liu, Ziqing Yang, Xinyue Shen, Michael Backes, and Yang Zhang. 2025. JailbreakRadar: Comprehensive Assessment of Jailbreak Attacks against LLMs. InProceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Association for Computational Linguistics, Vienna, Austria, 21538–21566. doi:1...

work page doi:10.18653/v1/2025.acl-long.1045 2025
[13]

Shen Dong, Shaochen Xu, Pengfei He, Yige Li, Jiliang Tang, Tianming Liu, Hui Liu, and Zhen Xiang. 2025. Memory Injection Attacks on LLM Agents via Query- Only Interaction. InProceedings of the 39th International Conference on Neural Information Processing Systems

2025
[14]

Javid Ebrahimi, Anyi Rao, Daniel Lowd, and Dejing Dou. 2018. HotFlip: White- Box Adversarial Examples for Text Classification. InProceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers). Association for Computational Linguistics, Melbourne, Australia, 31–36. doi:10.18653/v1/P18-2006

work page doi:10.18653/v1/p18-2006 2018
[15]

Huawen Feng, Zekun Yao, Junhao Zheng, and Qianli Ma. 2025. Training Large Language Models for Retrieval-Augmented Question Answering through Back- tracking Correction. InThe Thirteenth International Conference on Learning Rep- resentations. OpenReview.net

2025
[16]

Mohamed Amine Ferrag, Norbert Tihanyi, and Merouane Debbah. 2025. From LLM Reasoning to Autonomous AI Agents: A Comprehensive Review. doi:10. 48550/arXiv.2504.19678

work page internal anchor Pith review Pith/arXiv arXiv 2025
[17]

Alejandro Fuster Baggetto and Victor Fresno. 2022. Is Anisotropy Really the Cause of BERT Embeddings Not Being Semantic?. InFindings of the Association for Computational Linguistics: EMNLP 2022. Association for Computational Linguis- tics, Abu Dhabi, United Arab Emirates, 4271–4281. doi:10.18653/v1/2022.findings- emnlp.314

work page doi:10.18653/v1/2022.findings- 2022
[18]

Yuyang Gong, Zhuo Chen, Jiawei Liu, Miaokun Chen, Fengchang Yu, Wei Lu, XiaoFeng Wang, and Xiaozhong Liu. 2025. Topic-FlipRAG: Topic-Orientated Adversarial Opinion Manipulation Attacks to Retrieval-Augmented Generation Models. InProceedings of the 34th USENIX Conference on Security Symposium (SEC ’25). USENIX Association, USA, 3807–3826

2025
[19]

Kai Greshake, Sahar Abdelnabi, Shailesh Mishra, Christoph Endres, Thorsten Holz, and Mario Fritz. 2023. Not What You’ve Signed up for: Compromising Real-World LLM-integrated Applications with Indirect Prompt Injection. In Proceedings of the 16th ACM Workshop on Artificial Intelligence and Security. ACM, Copenhagen Denmark, 79–90. doi:10.1145/3605764.3623985

work page doi:10.1145/3605764.3623985 2023
[20]

Tyler Haberle, Courtney Cleveland, Greg L Snow, Chris Barber, Nikki Stookey, Cari Thornock, Laurie Younger, Buzzy Mullahkhel, and Diego Ize-Ludlow. 2024. The Impact of Nuance DAX Ambient Listening AI Documentation: A Cohort Study.Journal of the American Medical Informatics Association31, 4 (2024), 975–979. doi:10.1093/jamia/ocae022

work page doi:10.1093/jamia/ocae022 2024
[21]

Yuyang Hu, Shichun Liu, Yanwei Yue, Guibin Zhang, Boyang Liu, Fangyi Zhu, Jiahang Lin, Honglin Guo, Shihan Dou, Zhiheng Xi, Senjie Jin, Jiejun Tan, Yanbin Yin, Jiongnan Liu, Zeyu Zhang, Zhongxiang Sun, Yutao Zhu, Hao Sun, Boci Peng, Zhenrong Cheng, Xuanbo Fan, Jiaxin Guo, Xinlei Yu, Zhenhong Zhou, Zewen Hu, Jiahao Huo, Junhao Wang, Yuwei Niu, Yu Wang, Zhe...

work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.2512.13564 2025
[22]

Yidong Huang, Jacob Sansom, Ziqiao Ma, Felix Gervits, and Joyce Chai. 2024. DriVLMe: Enhancing LLM-based Autonomous Driving Agents with Embodied and Social Experiences. In2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). 3153–3160. doi:10.1109/IROS58592.2024.10802555

work page doi:10.1109/iros58592.2024.10802555 2024
[23]

Gautier Izacard, Mathilde Caron, Lucas Hosseini, Sebastian Riedel, Piotr Bo- janowski, Armand Joulin, and Edouard Grave. 2022. Unsupervised Dense Infor- mation Retrieval with Contrastive Learning.Transactions on Machine Learning Research(2022)

2022
[24]

Neel Jain, Avi Schwarzschild, Yuxin Wen, Gowthami Somepalli, John Kirchen- bauer, Ping-yeh Chiang, Micah Goldblum, Aniruddha Saha, Jonas Geiping, and Tom Goldstein. 2023. Baseline Defenses for Adversarial Attacks against Aligned Language Models. doi:10.48550/arXiv.2309.00614

work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.2309.00614 2023
[25]

Vladimir Karpukhin, Barlas Oguz, Sewon Min, Patrick Lewis, Ledell Wu, Sergey Edunov, Danqi Chen, and Wen-tau Yih. 2020. Dense Passage Retrieval for Open- Domain Question Answering. InProceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP). Association for Computational Linguistics, Online, 6769–6781. doi:10.18653/v1/...

work page doi:10.18653/v1/2020.emnlp-main.550 2020
[26]

Patrick Lewis, Ethan Perez, Aleksandra Piktus, Fabio Petroni, Vladimir Karpukhin, Naman Goyal, Heinrich Küttler, Mike Lewis, Wen-tau Yih, Tim Rocktäschel, Sebastian Riedel, and Douwe Kiela. 2020. Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks. InProceedings of the 34th Inter- national Conference on Neural Information Processing Systems (...

2020
[27]

Zhiyu Li, Chenyang Xi, Chunyu Li, Ding Chen, Boyu Chen, Shichao Song, Simin Niu, Hanyu Wang, Jiawei Yang, Chen Tang, Qingchen Yu, Jihao Zhao, Yezhaohui Wang, Peng Liu, Zehao Lin, Pengyuan Wang, Jiahao Huo, Tianyi Chen, Kai Chen, Kehang Li, Zhen Tao, Huayi Lai, Hao Wu, Bo Tang, Zhengren Wang, Zhaoxin Fan, Ningyu Zhang, Linfeng Zhang, Junchi Yan, Mingchuan ...

work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.2507.03724 2025
[28]

Yupei Liu, Yuqi Jia, Runpeng Geng, Jinyuan Jia, and Neil Zhenqiang Gong. 2024. Formalizing and Benchmarking Prompt Injection Attacks and Defenses. InPro- ceedings of the 33rd USENIX Conference on Security Symposium (SEC ’24). USENIX Association, USA, 1831–1847

2024
[29]

Adyasha Maharana, Dong-Ho Lee, Sergey Tulyakov, Mohit Bansal, Francesco Bar- bieri, and Yuwei Fang. 2024. Evaluating Very Long-Term Conversational Memory of LLM Agents. InProceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Association for Computational Linguistics, Bangkok, Thailand, 13851–13870...

work page doi:10.18653/v1/2024.acl-long.747 2024
[30]

Jianmo Ni, Chen Qu, Jing Lu, Zhuyun Dai, Gustavo Hernandez Abrego, Ji Ma, Vincent Zhao, Yi Luan, Keith Hall, Ming-Wei Chang, and Yinfei Yang. 2022. Large Dual Encoders Are Generalizable Retrievers. InProceedings of the 2022 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Abu Dhabi, United Arab Emi...

work page doi:10.18653/v1/2022.emnlp-main.669 2022
[31]

MemGPT: Towards LLMs as Operating Systems

Charles Packer, Sarah Wooders, Kevin Lin, Vivian Fang, Shishir G. Patil, Ion Stoica, and Joseph E. Gonzalez. 2024. MemGPT: Towards LLMs as Operating Systems. doi:10.48550/arXiv.2310.08560

work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.2310.08560 2024
[32]

Vicky Zhao, Lili Qiu, and Jianfeng Gao

Zhuoshi Pan, Qianhui Wu, Huiqiang Jiang, Xufang Luo, Hao Cheng, Dongsheng Li, Yuqing Yang, Chin-Yew Lin, H. Vicky Zhao, Lili Qiu, and Jianfeng Gao. 2025. SeCom: On Memory Construction and Retrieval for Personalized Conversational Agents. InThe Thirteenth International Conference on Learning Representations. OpenReview.net

2025
[33]

Mitchell Piehl, Zhaohan Xi, Zuobin Xiong, Pan He, and Muchao Ye. 2026. ER- MIA: Black-Box Adversarial Memory Injection Attacks on Long-Term Memory- Augmented Large Language Models. doi:10.48550/arXiv.2602.15344

work page doi:10.48550/arxiv.2602.15344 2026
[34]

Preston Rasmussen, Pavlo Paliychuk, Travis Beauvais, Jack Ryan, and Daniel Chalef. 2025. Zep: A Temporal Knowledge Graph Architecture for Agent Memory. doi:10.48550/arXiv.2501.13956

work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.2501.13956 2025
[35]

Rana Salama, Jason Cai, Michelle Yuan, Anna Currey, Monica Sunkara, Yi Zhang, and Yassine Benajiba. 2025. MemInsight: Autonomous Memory Augmentation for LLM Agents. InProceedings of the 2025 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Suzhou, China, 33124–33140. doi:10.18653/v1/2025.emnlp-main.1683

work page doi:10.18653/v1/2025.emnlp-main.1683 2025
[36]

Karan Singhal, Shekoofeh Azizi, Tao Tu, S. Sara Mahdavi, Jason Wei, Hyung Won Chung, Nathan Scales, Ajay Tanwani, Heather Cole-Lewis, Stephen Pfohl, Perry Payne, Martin Seneviratne, Paul Gamble, Chris Kelly, Abubakr Babiker, Nathanael Schärli, Aakanksha Chowdhery, Philip Mansfield, Dina Demner-Fushman, Blaise Agüera Y Arcas, Dale Webster, Greg S. Corrado,...

work page doi:10.1038/s41586-023-06291-2 2023
[37]

Kaitao Song, Xu Tan, Tao Qin, Jianfeng Lu, and Tie-Yan Liu. 2020. MPNet: Masked and Permuted Pre-Training for Language Understanding. InProceedings of the 34th International Conference on Neural Information Processing Systems, H. Larochelle, M. Ranzato, R. Hadsell, M.F. Balcan, and H. Lin (Eds.), Vol. 33. Curran Associates, Inc., 16857–16867

2020
[38]

Saksham Sahai Srivastava and Haoyu He. 2025. MemoryGraft: Persistent Com- promise of LLM Agents via Poisoned Experience Retrieval. doi:10.48550/arXiv. 2512.16962

work page internal anchor Pith review doi:10.48550/arxiv 2025
[39]

Arun James Thirunavukarasu, Darren Shu Jeng Ting, Kabilan Elangovan, Laura Gutierrez, Ting Fang Tan, and Daniel Shu Wei Ting. 2023. Large Language Models in Medicine.Nature Medicine29, 8 (2023), 1931–1940. doi:10.1038/s41591-023- 02448-8

work page doi:10.1038/s41591-023- 2023
[40]

Hanling Tian, Zeyang Sha, Jingying Wang, Yuhang Liu, Zhehao Huang, and Xiaolin Huang. 2026. InjecMEM: Memory Injection Attack on LLM Agent Memory Systems. https://openreview.net/forum?id=QVX6hcJ2um

2026
[41]

Liang Wang, Nan Yang, Xiaolong Huang, Binxing Jiao, Linjun Yang, Daxin Jiang, Rangan Majumder, and Furu Wei. 2022. Text Embeddings by Weakly-Supervised Contrastive Pre-Training. doi:10.48550/arXiv.2212.03533

work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.2212.03533 2022
[42]

Wenhui Wang, Furu Wei, Li Dong, Hangbo Bao, Nan Yang, and Ming Zhou. 2020. MINILM: Deep Self-Attention Distillation for Task-Agnostic Compression of Pre-Trained Transformers. InProceedings of the 34th International Conference on Hongtao Wang, Se Yang, Yu Chen, and Puzhuo Liu Neural Information Processing Systems (NIPS ’20). Curran Associates Inc., Red Hoo...

2020
[43]

Xunguang Wang, Daoyuan Wu, Zhenlan Ji, Zongjie Li, Pingchuan Ma, Shuai Wang, Yingjiu Li, Yang Liu, Ning Liu, and Juergen Rahmel. 2025. SELFDEFEND: LLMs Can Defend Themselves against Jailbreaking in a Practical Manner. In Proceedings of the 34th USENIX Conference on Security Symposium (SEC ’25). USENIX Association, USA, 2441–2460

2025
[44]

Di Wu, Hongwei Wang, Wenhao Yu, Yuwei Zhang, Kai-Wei Chang, and Dong Yu
[45]

InThe Thirteenth International Conference on Learning Representations

LongMemEval: Benchmarking Chat Assistants on Long-Term Interactive Memory. InThe Thirteenth International Conference on Learning Representations. OpenReview.net
[46]

Bennett, Junaid Ahmed, and Arnold Overwijk

Lee Xiong, Chenyan Xiong, Ye Li, Kwok-Fung Tang, Jialin Liu, Paul N. Bennett, Junaid Ahmed, and Arnold Overwijk. 2020. Approximate Nearest Neighbor Neg- ative Contrastive Learning for Dense Text Retrieval. InThe Ninth International Conference on Learning Representations. OpenReview.net

2020
[47]

Wujiang Xu, Zujie Liang, Kai Mei, Hang Gao, Juntao Tan, and Yongfeng Zhang
[48]

InProceedings of the 39th International Conference on Neural Information Processing Systems

A-Mem: Agentic Memory for LLM Agents. InProceedings of the 39th International Conference on Neural Information Processing Systems
[49]

Hongyang Yang, Xiao-Yang Liu, and Christina Dan Wang. 2023. FinGPT: Open- Source Financial Large Language Models. doi:10.48550/arXiv.2306.06031

work page doi:10.48550/arxiv.2306.06031 2023
[50]

Puxuan Yu, Luke Merrick, Gaurav Nuti, and Daniel Campos. 2024. Arctic-Embed 2.0: Multilingual Retrieval without Compromise. doi:10.48550/arXiv.2412.04506

work page doi:10.48550/arxiv.2412.04506 2024
[51]

Baolei Zhang, Haoran Xin, Minghong Fang, Zhuqing Liu, Biao Yi, Tong Li, and Zheli Liu. 2025. Traceback of Poisoning Attacks to Retrieval-Augmented Generation. InProceedings of the ACM on Web Conference 2025. ACM, Sydney NSW Australia, 2085–2097. doi:10.1145/3696410.3714756

work page doi:10.1145/3696410.3714756 2025
[52]

Kehao Zhang, Shangtong Gui, Sheng Yang, Wei Chen, and Yang Feng. 2026. Learn- ing to Remember: End-to-End Training of Memory Agents for Long-Context Reasoning. doi:10.48550/arXiv.2602.18493

work page doi:10.48550/arxiv.2602.18493 2026
[53]

Peitian Zhang, Shitao Xiao, Zheng Liu, Zhicheng Dou, and Jian-Yun Nie. 2023. A Multi-Task Embedder for Retrieval Augmented LLMs. doi:10.48550/arXiv.2310. 07554

work page doi:10.48550/arxiv.2310 2023
[54]

Topol, Jure Leskovec, and Michael Moor

Qinyue Zheng, Salman Abdullah, Sam Rawal, Cyril Zakka, Sophie Ostmeier, Maximilian Purk, Eduardo Reis, Eric J. Topol, Jure Leskovec, and Michael Moor
[55]

doi:10.48550/arXiv.2506.06091

MIRIAD: Augmenting LLMs with Millions of Medical Query-Response Pairs. doi:10.48550/arXiv.2506.06091

work page doi:10.48550/arxiv.2506.06091
[56]

Zexuan Zhong, Ziqing Huang, Alexander Wettig, and Danqi Chen. 2023. Poison- ing Retrieval Corpora by Injecting Adversarial Passages. InProceedings of the 2023 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Singapore, 13764–13775. doi:10.18653/v1/2023.emnlp- main.849

work page doi:10.18653/v1/2023.emnlp- 2023
[57]

Wei Zou, Runpeng Geng, Binghui Wang, and Jinyuan Jia. 2025. PoisonedRAG: Knowledge Corruption Attacks to Retrieval-Augmented Generation of Large Language Models. InProceedings of the 34th USENIX Conference on Security Symposium (SEC ’25). USENIX Association, USA, 3827–3844. A Pilot Study A.1 Dataset We conduct the pilot study onLoCoMo[ 27], a dataset of v...

2025
[58]

semantic sanitization

The prompting template is shown in Fig. 13. We store the input sentence as original_text and the para- phrase asrewrite_text. Paraphrasing Prompt System Prompt:You are a careful paraphrasing assistant. Rewrite the input sentence into fluent, natural English while preserving the original meaning and all factual information. Do not add or remove any facts. ...

1999
[59]

If the answer cannot be found, output exactly:I don’t know

Strict Grounding: Do not use any external knowledge. If the answer cannot be found, output exactly:I don’t know
[60]

Based on

Answer directly. Do NOT start with “Based on... ”, “The answer is... ”, or any filler words. [Retrieved Memories]: {retrieved_memories} [User Question]: {question} Your Response: Figure 16: The prompt used for answer generation in our evaluation pipeline. C.5 Default Evaluation Configuration Unless otherwise specified, we use Mem0 as the default memory me...

[1] [1]

Gabriel Alon and Michael Kamfonas. 2023. Detecting Language Model Attacks with Perplexity. doi:10.48550/arXiv.2308.14132

work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.2308.14132 2023

[2] [2]

Akari Asai, Zeqiu Wu, Yizhong Wang, Avirup Sil, and Hannaneh Hajishirzi

[3] [3]

InThe Twelfth International Conference on Learning Representations

Self-RAG: Learning to Retrieve, Generate, and Critique through Self- Reflection. InThe Twelfth International Conference on Learning Representations. OpenReview.net

[4] [4]

Matan Ben-Tov and Mahmood Sharif. 2025. GASLITEing the Retrieval: Exploring Vulnerabilities in Dense Embedding-Based Search. InProceedings of the 2025 ACM SIGSAC Conference on Computer and Communications Security. ACM, Taipei Taiwan, 4364–4378. doi:10.1145/3719027.3765095

work page doi:10.1145/3719027.3765095 2025

[5] [5]

Xingyu Cai, Jiaji Huang, Yuchen Bian, and Kenneth Church. 2020. Isotropy in the Contextual Embedding Space: Clusters and Manifolds. InInternational Conference on Learning Representations. OpenReview.net

2020

[6] [6]

Choquette-Choo, Milad Nasr, Cristina Nita-Rotaru, and Alina Oprea

Harsh Chaudhari, Giorgio Severi, John Abascal, Anshuman Suri, Matthew Jagiel- ski, Christopher A. Choquette-Choo, Milad Nasr, Cristina Nita-Rotaru, and Alina Oprea. 2026. Phantom: General Backdoor Attacks on Retrieval Augmented Language Generation.ACM Trans. AI Secur. Priv.(2026). doi:10.1145/3796729

work page doi:10.1145/3796729 2026

[7] [7]

Zhiyu Chen, Wenhu Chen, Charese Smiley, Sameena Shah, Iana Borova, Dylan Langdon, Reema Moussa, Matt Beane, Ting-Hao Huang, Bryan Routledge, and William Yang Wang. 2021. FinQA: A Dataset of Numerical Reasoning over Financial Data. InProceedings of the 2021 Conference on Empirical Methods in Natural Language Processing. Association for Computational Lingui...

work page doi:10.18653/v1/2021.emnlp- 2021

[8] [8]

Zhuo Chen, Yuyang Gong, Jiawei Liu, Miaokun Chen, Haotan Liu, Qikai Cheng, Fan Zhang, Wei Lu, and Xiaozhong Liu. 2025. FlippedRAG: Black-Box Opinion Manipulation Adversarial Attacks to Retrieval-Augmented Generation Models. In Hijacking Agent Memory: Stealthy Trojan Attacks Through Conversational Interaction Proceedings of the 2025 ACM SIGSAC Conference o...

work page doi:10.1145/3719027.3765023 2025

[9] [9]

Zhaorun Chen, Zhen Xiang, Chaowei Xiao, Dawn Song, and Bo Li. 2024. Agent- Poison: Red-Teaming LLM Agents via Poisoning Memory or Knowledge Bases. InProceedings of the 38th International Conference on Neural Information Process- ing Systems, A. Globerson, L. Mackey, D. Belgrave, A. Fan, U. Paquet, J. Tom- czak, and C. Zhang (Eds.), Vol. 37. Curran Associa...

work page doi:10.52202/079017-4136 2024

[10] [10]

Prateek Chhikara, Dev Khant, Saket Aryan, Taranjeet Singh, and Deshraj Yadav

[11] [11]

Mem0: Building Production-Ready AI Agents with Scalable Long-Term Memory

Mem0: Building Production-Ready AI Agents with Scalable Long-Term Memory. doi:10.48550/arXiv.2504.19413

work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.2504.19413

[12] [12]

Junjie Chu, Yugeng Liu, Ziqing Yang, Xinyue Shen, Michael Backes, and Yang Zhang. 2025. JailbreakRadar: Comprehensive Assessment of Jailbreak Attacks against LLMs. InProceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Association for Computational Linguistics, Vienna, Austria, 21538–21566. doi:1...

work page doi:10.18653/v1/2025.acl-long.1045 2025

[13] [13]

Shen Dong, Shaochen Xu, Pengfei He, Yige Li, Jiliang Tang, Tianming Liu, Hui Liu, and Zhen Xiang. 2025. Memory Injection Attacks on LLM Agents via Query- Only Interaction. InProceedings of the 39th International Conference on Neural Information Processing Systems

2025

[14] [14]

Javid Ebrahimi, Anyi Rao, Daniel Lowd, and Dejing Dou. 2018. HotFlip: White- Box Adversarial Examples for Text Classification. InProceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers). Association for Computational Linguistics, Melbourne, Australia, 31–36. doi:10.18653/v1/P18-2006

work page doi:10.18653/v1/p18-2006 2018

[15] [15]

Huawen Feng, Zekun Yao, Junhao Zheng, and Qianli Ma. 2025. Training Large Language Models for Retrieval-Augmented Question Answering through Back- tracking Correction. InThe Thirteenth International Conference on Learning Rep- resentations. OpenReview.net

2025

[16] [16]

Mohamed Amine Ferrag, Norbert Tihanyi, and Merouane Debbah. 2025. From LLM Reasoning to Autonomous AI Agents: A Comprehensive Review. doi:10. 48550/arXiv.2504.19678

work page internal anchor Pith review Pith/arXiv arXiv 2025

[17] [17]

Alejandro Fuster Baggetto and Victor Fresno. 2022. Is Anisotropy Really the Cause of BERT Embeddings Not Being Semantic?. InFindings of the Association for Computational Linguistics: EMNLP 2022. Association for Computational Linguis- tics, Abu Dhabi, United Arab Emirates, 4271–4281. doi:10.18653/v1/2022.findings- emnlp.314

work page doi:10.18653/v1/2022.findings- 2022

[18] [18]

Yuyang Gong, Zhuo Chen, Jiawei Liu, Miaokun Chen, Fengchang Yu, Wei Lu, XiaoFeng Wang, and Xiaozhong Liu. 2025. Topic-FlipRAG: Topic-Orientated Adversarial Opinion Manipulation Attacks to Retrieval-Augmented Generation Models. InProceedings of the 34th USENIX Conference on Security Symposium (SEC ’25). USENIX Association, USA, 3807–3826

2025

[19] [19]

Kai Greshake, Sahar Abdelnabi, Shailesh Mishra, Christoph Endres, Thorsten Holz, and Mario Fritz. 2023. Not What You’ve Signed up for: Compromising Real-World LLM-integrated Applications with Indirect Prompt Injection. In Proceedings of the 16th ACM Workshop on Artificial Intelligence and Security. ACM, Copenhagen Denmark, 79–90. doi:10.1145/3605764.3623985

work page doi:10.1145/3605764.3623985 2023

[20] [20]

Tyler Haberle, Courtney Cleveland, Greg L Snow, Chris Barber, Nikki Stookey, Cari Thornock, Laurie Younger, Buzzy Mullahkhel, and Diego Ize-Ludlow. 2024. The Impact of Nuance DAX Ambient Listening AI Documentation: A Cohort Study.Journal of the American Medical Informatics Association31, 4 (2024), 975–979. doi:10.1093/jamia/ocae022

work page doi:10.1093/jamia/ocae022 2024

[21] [21]

Yuyang Hu, Shichun Liu, Yanwei Yue, Guibin Zhang, Boyang Liu, Fangyi Zhu, Jiahang Lin, Honglin Guo, Shihan Dou, Zhiheng Xi, Senjie Jin, Jiejun Tan, Yanbin Yin, Jiongnan Liu, Zeyu Zhang, Zhongxiang Sun, Yutao Zhu, Hao Sun, Boci Peng, Zhenrong Cheng, Xuanbo Fan, Jiaxin Guo, Xinlei Yu, Zhenhong Zhou, Zewen Hu, Jiahao Huo, Junhao Wang, Yuwei Niu, Yu Wang, Zhe...

work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.2512.13564 2025

[22] [22]

Yidong Huang, Jacob Sansom, Ziqiao Ma, Felix Gervits, and Joyce Chai. 2024. DriVLMe: Enhancing LLM-based Autonomous Driving Agents with Embodied and Social Experiences. In2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). 3153–3160. doi:10.1109/IROS58592.2024.10802555

work page doi:10.1109/iros58592.2024.10802555 2024

[23] [23]

Gautier Izacard, Mathilde Caron, Lucas Hosseini, Sebastian Riedel, Piotr Bo- janowski, Armand Joulin, and Edouard Grave. 2022. Unsupervised Dense Infor- mation Retrieval with Contrastive Learning.Transactions on Machine Learning Research(2022)

2022

[24] [24]

Neel Jain, Avi Schwarzschild, Yuxin Wen, Gowthami Somepalli, John Kirchen- bauer, Ping-yeh Chiang, Micah Goldblum, Aniruddha Saha, Jonas Geiping, and Tom Goldstein. 2023. Baseline Defenses for Adversarial Attacks against Aligned Language Models. doi:10.48550/arXiv.2309.00614

work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.2309.00614 2023

[25] [25]

Vladimir Karpukhin, Barlas Oguz, Sewon Min, Patrick Lewis, Ledell Wu, Sergey Edunov, Danqi Chen, and Wen-tau Yih. 2020. Dense Passage Retrieval for Open- Domain Question Answering. InProceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP). Association for Computational Linguistics, Online, 6769–6781. doi:10.18653/v1/...

work page doi:10.18653/v1/2020.emnlp-main.550 2020

[26] [26]

Patrick Lewis, Ethan Perez, Aleksandra Piktus, Fabio Petroni, Vladimir Karpukhin, Naman Goyal, Heinrich Küttler, Mike Lewis, Wen-tau Yih, Tim Rocktäschel, Sebastian Riedel, and Douwe Kiela. 2020. Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks. InProceedings of the 34th Inter- national Conference on Neural Information Processing Systems (...

2020

[27] [27]

Zhiyu Li, Chenyang Xi, Chunyu Li, Ding Chen, Boyu Chen, Shichao Song, Simin Niu, Hanyu Wang, Jiawei Yang, Chen Tang, Qingchen Yu, Jihao Zhao, Yezhaohui Wang, Peng Liu, Zehao Lin, Pengyuan Wang, Jiahao Huo, Tianyi Chen, Kai Chen, Kehang Li, Zhen Tao, Huayi Lai, Hao Wu, Bo Tang, Zhengren Wang, Zhaoxin Fan, Ningyu Zhang, Linfeng Zhang, Junchi Yan, Mingchuan ...

work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.2507.03724 2025

[28] [28]

Yupei Liu, Yuqi Jia, Runpeng Geng, Jinyuan Jia, and Neil Zhenqiang Gong. 2024. Formalizing and Benchmarking Prompt Injection Attacks and Defenses. InPro- ceedings of the 33rd USENIX Conference on Security Symposium (SEC ’24). USENIX Association, USA, 1831–1847

2024

[29] [29]

Adyasha Maharana, Dong-Ho Lee, Sergey Tulyakov, Mohit Bansal, Francesco Bar- bieri, and Yuwei Fang. 2024. Evaluating Very Long-Term Conversational Memory of LLM Agents. InProceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Association for Computational Linguistics, Bangkok, Thailand, 13851–13870...

work page doi:10.18653/v1/2024.acl-long.747 2024

[30] [30]

Jianmo Ni, Chen Qu, Jing Lu, Zhuyun Dai, Gustavo Hernandez Abrego, Ji Ma, Vincent Zhao, Yi Luan, Keith Hall, Ming-Wei Chang, and Yinfei Yang. 2022. Large Dual Encoders Are Generalizable Retrievers. InProceedings of the 2022 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Abu Dhabi, United Arab Emi...

work page doi:10.18653/v1/2022.emnlp-main.669 2022

[31] [31]

MemGPT: Towards LLMs as Operating Systems

Charles Packer, Sarah Wooders, Kevin Lin, Vivian Fang, Shishir G. Patil, Ion Stoica, and Joseph E. Gonzalez. 2024. MemGPT: Towards LLMs as Operating Systems. doi:10.48550/arXiv.2310.08560

work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.2310.08560 2024

[32] [32]

Vicky Zhao, Lili Qiu, and Jianfeng Gao

Zhuoshi Pan, Qianhui Wu, Huiqiang Jiang, Xufang Luo, Hao Cheng, Dongsheng Li, Yuqing Yang, Chin-Yew Lin, H. Vicky Zhao, Lili Qiu, and Jianfeng Gao. 2025. SeCom: On Memory Construction and Retrieval for Personalized Conversational Agents. InThe Thirteenth International Conference on Learning Representations. OpenReview.net

2025

[33] [33]

Mitchell Piehl, Zhaohan Xi, Zuobin Xiong, Pan He, and Muchao Ye. 2026. ER- MIA: Black-Box Adversarial Memory Injection Attacks on Long-Term Memory- Augmented Large Language Models. doi:10.48550/arXiv.2602.15344

work page doi:10.48550/arxiv.2602.15344 2026

[34] [34]

Preston Rasmussen, Pavlo Paliychuk, Travis Beauvais, Jack Ryan, and Daniel Chalef. 2025. Zep: A Temporal Knowledge Graph Architecture for Agent Memory. doi:10.48550/arXiv.2501.13956

work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.2501.13956 2025

[35] [35]

Rana Salama, Jason Cai, Michelle Yuan, Anna Currey, Monica Sunkara, Yi Zhang, and Yassine Benajiba. 2025. MemInsight: Autonomous Memory Augmentation for LLM Agents. InProceedings of the 2025 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Suzhou, China, 33124–33140. doi:10.18653/v1/2025.emnlp-main.1683

work page doi:10.18653/v1/2025.emnlp-main.1683 2025

[36] [36]

Karan Singhal, Shekoofeh Azizi, Tao Tu, S. Sara Mahdavi, Jason Wei, Hyung Won Chung, Nathan Scales, Ajay Tanwani, Heather Cole-Lewis, Stephen Pfohl, Perry Payne, Martin Seneviratne, Paul Gamble, Chris Kelly, Abubakr Babiker, Nathanael Schärli, Aakanksha Chowdhery, Philip Mansfield, Dina Demner-Fushman, Blaise Agüera Y Arcas, Dale Webster, Greg S. Corrado,...

work page doi:10.1038/s41586-023-06291-2 2023

[37] [37]

Kaitao Song, Xu Tan, Tao Qin, Jianfeng Lu, and Tie-Yan Liu. 2020. MPNet: Masked and Permuted Pre-Training for Language Understanding. InProceedings of the 34th International Conference on Neural Information Processing Systems, H. Larochelle, M. Ranzato, R. Hadsell, M.F. Balcan, and H. Lin (Eds.), Vol. 33. Curran Associates, Inc., 16857–16867

2020

[38] [38]

Saksham Sahai Srivastava and Haoyu He. 2025. MemoryGraft: Persistent Com- promise of LLM Agents via Poisoned Experience Retrieval. doi:10.48550/arXiv. 2512.16962

work page internal anchor Pith review doi:10.48550/arxiv 2025

[39] [39]

Arun James Thirunavukarasu, Darren Shu Jeng Ting, Kabilan Elangovan, Laura Gutierrez, Ting Fang Tan, and Daniel Shu Wei Ting. 2023. Large Language Models in Medicine.Nature Medicine29, 8 (2023), 1931–1940. doi:10.1038/s41591-023- 02448-8

work page doi:10.1038/s41591-023- 2023

[40] [40]

Hanling Tian, Zeyang Sha, Jingying Wang, Yuhang Liu, Zhehao Huang, and Xiaolin Huang. 2026. InjecMEM: Memory Injection Attack on LLM Agent Memory Systems. https://openreview.net/forum?id=QVX6hcJ2um

2026

[41] [41]

Liang Wang, Nan Yang, Xiaolong Huang, Binxing Jiao, Linjun Yang, Daxin Jiang, Rangan Majumder, and Furu Wei. 2022. Text Embeddings by Weakly-Supervised Contrastive Pre-Training. doi:10.48550/arXiv.2212.03533

work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.2212.03533 2022

[42] [42]

Wenhui Wang, Furu Wei, Li Dong, Hangbo Bao, Nan Yang, and Ming Zhou. 2020. MINILM: Deep Self-Attention Distillation for Task-Agnostic Compression of Pre-Trained Transformers. InProceedings of the 34th International Conference on Hongtao Wang, Se Yang, Yu Chen, and Puzhuo Liu Neural Information Processing Systems (NIPS ’20). Curran Associates Inc., Red Hoo...

2020

[43] [43]

Xunguang Wang, Daoyuan Wu, Zhenlan Ji, Zongjie Li, Pingchuan Ma, Shuai Wang, Yingjiu Li, Yang Liu, Ning Liu, and Juergen Rahmel. 2025. SELFDEFEND: LLMs Can Defend Themselves against Jailbreaking in a Practical Manner. In Proceedings of the 34th USENIX Conference on Security Symposium (SEC ’25). USENIX Association, USA, 2441–2460

2025

[44] [44]

Di Wu, Hongwei Wang, Wenhao Yu, Yuwei Zhang, Kai-Wei Chang, and Dong Yu

[45] [45]

InThe Thirteenth International Conference on Learning Representations

LongMemEval: Benchmarking Chat Assistants on Long-Term Interactive Memory. InThe Thirteenth International Conference on Learning Representations. OpenReview.net

[46] [46]

Bennett, Junaid Ahmed, and Arnold Overwijk

Lee Xiong, Chenyan Xiong, Ye Li, Kwok-Fung Tang, Jialin Liu, Paul N. Bennett, Junaid Ahmed, and Arnold Overwijk. 2020. Approximate Nearest Neighbor Neg- ative Contrastive Learning for Dense Text Retrieval. InThe Ninth International Conference on Learning Representations. OpenReview.net

2020

[47] [47]

Wujiang Xu, Zujie Liang, Kai Mei, Hang Gao, Juntao Tan, and Yongfeng Zhang

[48] [48]

InProceedings of the 39th International Conference on Neural Information Processing Systems

A-Mem: Agentic Memory for LLM Agents. InProceedings of the 39th International Conference on Neural Information Processing Systems

[49] [49]

Hongyang Yang, Xiao-Yang Liu, and Christina Dan Wang. 2023. FinGPT: Open- Source Financial Large Language Models. doi:10.48550/arXiv.2306.06031

work page doi:10.48550/arxiv.2306.06031 2023

[50] [50]

Puxuan Yu, Luke Merrick, Gaurav Nuti, and Daniel Campos. 2024. Arctic-Embed 2.0: Multilingual Retrieval without Compromise. doi:10.48550/arXiv.2412.04506

work page doi:10.48550/arxiv.2412.04506 2024

[51] [51]

Baolei Zhang, Haoran Xin, Minghong Fang, Zhuqing Liu, Biao Yi, Tong Li, and Zheli Liu. 2025. Traceback of Poisoning Attacks to Retrieval-Augmented Generation. InProceedings of the ACM on Web Conference 2025. ACM, Sydney NSW Australia, 2085–2097. doi:10.1145/3696410.3714756

work page doi:10.1145/3696410.3714756 2025

[52] [52]

Kehao Zhang, Shangtong Gui, Sheng Yang, Wei Chen, and Yang Feng. 2026. Learn- ing to Remember: End-to-End Training of Memory Agents for Long-Context Reasoning. doi:10.48550/arXiv.2602.18493

work page doi:10.48550/arxiv.2602.18493 2026

[53] [53]

Peitian Zhang, Shitao Xiao, Zheng Liu, Zhicheng Dou, and Jian-Yun Nie. 2023. A Multi-Task Embedder for Retrieval Augmented LLMs. doi:10.48550/arXiv.2310. 07554

work page doi:10.48550/arxiv.2310 2023

[54] [54]

Topol, Jure Leskovec, and Michael Moor

Qinyue Zheng, Salman Abdullah, Sam Rawal, Cyril Zakka, Sophie Ostmeier, Maximilian Purk, Eduardo Reis, Eric J. Topol, Jure Leskovec, and Michael Moor

[55] [55]

doi:10.48550/arXiv.2506.06091

MIRIAD: Augmenting LLMs with Millions of Medical Query-Response Pairs. doi:10.48550/arXiv.2506.06091

work page doi:10.48550/arxiv.2506.06091

[56] [56]

Zexuan Zhong, Ziqing Huang, Alexander Wettig, and Danqi Chen. 2023. Poison- ing Retrieval Corpora by Injecting Adversarial Passages. InProceedings of the 2023 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Singapore, 13764–13775. doi:10.18653/v1/2023.emnlp- main.849

work page doi:10.18653/v1/2023.emnlp- 2023

[57] [57]

Wei Zou, Runpeng Geng, Binghui Wang, and Jinyuan Jia. 2025. PoisonedRAG: Knowledge Corruption Attacks to Retrieval-Augmented Generation of Large Language Models. InProceedings of the 34th USENIX Conference on Security Symposium (SEC ’25). USENIX Association, USA, 3827–3844. A Pilot Study A.1 Dataset We conduct the pilot study onLoCoMo[ 27], a dataset of v...

2025

[58] [58]

semantic sanitization

The prompting template is shown in Fig. 13. We store the input sentence as original_text and the para- phrase asrewrite_text. Paraphrasing Prompt System Prompt:You are a careful paraphrasing assistant. Rewrite the input sentence into fluent, natural English while preserving the original meaning and all factual information. Do not add or remove any facts. ...

1999

[59] [59]

If the answer cannot be found, output exactly:I don’t know

Strict Grounding: Do not use any external knowledge. If the answer cannot be found, output exactly:I don’t know

[60] [60]

Based on

Answer directly. Do NOT start with “Based on... ”, “The answer is... ”, or any filler words. [Retrieved Memories]: {retrieved_memories} [User Question]: {question} Your Response: Figure 16: The prompt used for answer generation in our evaluation pipeline. C.5 Default Evaluation Configuration Unless otherwise specified, we use Mem0 as the default memory me...