On Memory Construction and Retrieval for Personalized Conversational Agents

Chin-Yew Lin; Dongsheng Li; Hao Cheng; Huiqiang Jiang; H. Vicky Zhao; Jianfeng Gao; Lili Qiu; Qianhui Wu; Xufang Luo; Yuqing Yang

arxiv: 2502.05589 · v3 · pith:WKTFL355new · submitted 2025-02-08 · 💻 cs.CL · cs.AI

Zhuoshi Pan, Qianhui Wu, Huiqiang Jiang, Xufang Luo, Hao Cheng, Dongsheng Li, Yuqing Yang, Chin-Yew Lin, H. Vicky Zhao, Lili Qiu, Jianfeng Gao

classification 💻 cs.CL cs.AI

keywords memory · retrieval · conversation · long-term · segmentation · accuracy · coherent · compression

To deliver coherent and personalized experiences in long-term conversations, existing approaches typically perform retrieval-augmented response generation by constructing memory banks from conversation history at the turn level, at the session level, or through summarization techniques. In this paper, we present two key findings: (1) The granularity of the memory unit matters: turn-level, session-level, and summarization-based methods each exhibit limitations in both memory retrieval accuracy and the semantic quality of the retrieved content. (2) Prompt compression methods, such as LLMLingua-2, can effectively serve as a denoising mechanism, enhancing memory retrieval accuracy across different granularities. Building on these insights, we propose SeCom, a method that constructs the memory bank at the segment level by introducing a conversation segmentation model that partitions long-term conversations into topically coherent segments, while applying compression-based denoising on memory units to enhance memory retrieval. Experimental results show that SeCom exhibits a significant performance advantage over baselines on the long-term conversation benchmarks LOCOMO and Long-MT-Bench+. Additionally, the proposed conversation segmentation method demonstrates superior performance on dialogue segmentation datasets such as DialSeg711, TIAGE, and SuperDialSeg.
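The pipeline the abstract describes (segment the history, denoise each memory unit, retrieve by similarity) can be illustrated with a toy sketch. This is not SeCom's implementation. The lexical-overlap segmenter stands in for the paper's conversation segmentation model, the stopword filter stands in for LLMLingua-2 compression, and token overlap stands in for a real retriever. All function names, the stopword list, and the threshold are hypothetical choices made for illustration.

```python
import re

# Assumption: a tiny stopword list used as a crude "denoiser".
# The paper uses a prompt-compression model (LLMLingua-2) instead.
STOPWORDS = {"a", "an", "the", "i", "to", "of", "and", "is", "it", "in",
             "for", "you", "me", "my", "that", "this", "on"}

def tokens(text):
    """Lowercased word tokens with stopwords dropped (toy denoising step)."""
    return [w for w in re.findall(r"[a-z0-9']+", text.lower())
            if w not in STOPWORDS]

def jaccard(a, b):
    """Jaccard similarity between two token lists, treated as sets."""
    sa, sb = set(a), set(b)
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def segment(turns, threshold=0.1):
    """Toy segmenter: open a new segment when a turn shares too little
    vocabulary with the segment built so far."""
    segments = []
    for turn in turns:
        if segments and jaccard(tokens(" ".join(segments[-1])),
                                tokens(turn)) >= threshold:
            segments[-1].append(turn)
        else:
            segments.append([turn])
    return segments

def build_memory(turns, threshold=0.1):
    """One memory unit per segment: raw text plus a denoised retrieval key."""
    units = []
    for seg in segment(turns, threshold):
        text = " ".join(seg)
        units.append({"text": text, "key": set(tokens(text))})
    return units

def retrieve(memory, query, k=1):
    """Return the k memory units whose keys overlap most with the query."""
    q = set(tokens(query))
    ranked = sorted(memory, key=lambda m: len(q & m["key"]), reverse=True)
    return [m["text"] for m in ranked[:k]]

turns = [
    "I adopted a puppy named Biscuit last week.",
    "Biscuit the puppy chews every shoe I own.",
    "My flight to Lisbon leaves on Friday.",
    "Should I book a hotel near the Lisbon airport before the flight?",
]
memory = build_memory(turns)
top = retrieve(memory, "When does my flight to Lisbon leave?")
print(len(memory), top[0])
```

In this example the four turns fall into two segments (the puppy and the trip), and the query about the flight retrieves the whole trip segment rather than a single turn. That is the granularity argument the abstract makes: the retrieved unit carries the surrounding topical context.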

Forward citations

Cited by 12 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Trojan Hippo: Weaponizing Agent Memory for Data Exfiltration
cs.CR 2026-05 unverdicted novelty 8.0

Trojan Hippo attacks on LLM agent memory achieve 85-100% success rates in data exfiltration across four memory backends even after 100 benign sessions, while evaluated defenses reduce success rates but impose varying ...
Towards Root Memories: Benchmarking and Enhancing Implicit Logical Memory Retrieval for Personalized LLMs
cs.CL 2026-06 unverdicted novelty 7.0

Introduces IMLogic benchmark for implicit logical memory retrieval and RootMem framework that distills user histories into root memories and routes them via LLM to improve personalized LLM accuracy.
ElasticMem: Latent Memory as a Learnable Resource for LLM Agents
cs.CL 2026-05 unverdicted novelty 7.0

ElasticMem enables LLM agents to learn adaptive latent memory retrieval and elastic budget allocation, improving QA accuracy by 24-26% and ALFWorld success by 27-66% over baselines with lower token cost.
Trojan Hippo: Weaponizing Agent Memory for Data Exfiltration
cs.CR 2026-05 unverdicted novelty 6.0

The paper defines and evaluates Trojan Hippo attacks on LLM agent memory, showing 85-100% success in data exfiltration across backends and reduced rates with defenses at varying utility costs.
Memory-R1: Enhancing Large Language Model Agents to Manage and Utilize Memories via Reinforcement Learning
cs.CL 2025-08 unverdicted novelty 6.0

Memory-R1 uses PPO and GRPO to train a Memory Manager (ADD/UPDATE/DELETE/NOOP) and Answer Agent that together outperform baselines on long-context QA benchmarks after training on only 152 examples.
A-TMA: Decoupling State-Aware Memory Failures in Long-Term Agent Memory
cs.AI 2026-07 unverdicted novelty 5.0

ATMA adds state labels and evidence packets to existing memory systems to reduce ghost memory failures, with reported gains on a new LTP benchmark and LoCoMo.
MGRetrieval: Memory-Guided Reflective Retrieval for Long-Term Dialogue Agents
cs.IR 2026-05 unverdicted novelty 5.0

MGRetrieval grounds reflective retrieval in historical memory structure for long-term dialogue, yielding 8.91% F1 and 11.11% BLEU-1 gains over baselines on LoCoMo with Qwen models.
Improving Multi-turn Dialogue Consistency with Self-Recall Thinking
cs.CL 2026-05 unverdicted novelty 5.0

SRT framework improves multi-turn dialogue F1 by 4.7% and cuts end-to-end latency by 14.7% via dependency construction, capability initialization, and reasoning improvement with recall tokens.
HSUGA: LLM-Enhanced Recommendation with Hierarchical Semantic Understanding and Group-Aware Alignment
cs.IR 2026-05 unverdicted novelty 5.0

HSUGA improves LLM-enhanced sequential recommendation via staged hierarchical semantic understanding for better preference extraction and group-aware alignment that varies intensity by user activity level.
From Human Memory to AI Memory: A Survey on Memory Mechanisms in the Era of LLMs
cs.IR 2025-04 unverdicted novelty 5.0

The paper surveys human memory categories, maps them to LLM memory, and proposes a new three-dimension (object, form, time) categorization into eight quadrants to organize existing work and highlight open problems.
Back to Basics: Let Conversational Agents Remember with Just Retrieval and Generation
cs.CL 2026-04 unverdicted novelty 4.0

A minimalist retrieval-and-generation framework using turn isolation and query-driven pruning outperforms complex memory systems by directly addressing signal sparsity and dual-level redundancy in dialogues.
A Survey of Scaling in Large Language Model Reasoning
cs.AI 2025-04 unverdicted novelty 3.0

A survey categorizing scaling in LLM reasoning across input size, steps, rounds, training, and future directions, noting that scaling can negatively affect performance.