On Memory Construction and Retrieval for Personalized Conversational Agents
To deliver coherent and personalized experiences in long-term conversations, existing approaches typically perform retrieval augmented response generation by constructing memory banks from conversation history at either the turn-level, session-level, or through summarization techniques. In this paper, we present two key findings: (1) The granularity of memory unit matters: turn-level, session-level, and summarization-based methods each exhibit limitations in both memory retrieval accuracy and the semantic quality of the retrieved content. (2) Prompt compression methods, such as LLMLingua-2, can effectively serve as a denoising mechanism, enhancing memory retrieval accuracy across different granularities. Building on these insights, we propose SeCom, a method that constructs the memory bank at segment level by introducing a conversation segmentation model that partitions long-term conversations into topically coherent segments, while applying compression-based denoising on memory units to enhance memory retrieval. Experimental results show that SeCom exhibits a significant performance advantage over baselines on long-term conversation benchmarks LOCOMO and Long-MT-Bench+. Additionally, the proposed conversation segmentation method demonstrates superior performance on dialogue segmentation datasets such as DialSeg711, TIAGE, and SuperDialSeg.
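The pipeline the abstract describes (segment the conversation, denoise each memory unit, then retrieve at segment level) can be illustrated with a toy sketch. This is not the authors' implementation: a word-overlap heuristic stands in for the learned conversation segmentation model, and a small stop-word filter stands in for LLMLingua-2 compression. All names here (`STOPWORDS`, `denoise`, `segment`, `retrieve`, the `threshold` value) are hypothetical and chosen only for illustration.

```python
import re

# Hypothetical stop-word list: a crude stand-in for compression-based denoising.
STOPWORDS = {"a", "an", "the", "is", "are", "was", "i", "you", "it", "to", "of",
             "in", "for", "with", "and", "how", "can", "what", "did", "my", "me"}

def denoise(text):
    """Return the set of content words left after dropping stop words."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {w for w in words if w not in STOPWORDS}

def jaccard(a, b):
    """Jaccard similarity of two word sets (0.0 if either is empty)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)

def segment(turns, threshold=0.1):
    """Greedy segmentation: start a new segment when a turn shares too few
    content words with the segment built so far (assumed heuristic)."""
    segments = []
    for turn in turns:
        if segments and jaccard(denoise(turn), denoise(" ".join(segments[-1]))) >= threshold:
            segments[-1].append(turn)
        else:
            segments.append([turn])
    return segments

def retrieve(query, segments, k=1):
    """Rank segments by overlap between the denoised query and denoised segment."""
    q = denoise(query)
    ranked = sorted(segments, key=lambda s: jaccard(q, denoise(" ".join(s))), reverse=True)
    return ranked[:k]

turns = [
    "I adopted a puppy named Biscuit last week.",
    "Congratulations! How is Biscuit the puppy settling in?",
    "Can you recommend a laptop for video editing?",
    "For video editing, a laptop with a strong GPU helps.",
]
memory_bank = segment(turns)
top = retrieve("What did I name my puppy?", memory_bank)
print(len(memory_bank), top[0][0])
```

In this toy run the four turns fall into two topical segments, and the puppy question retrieves the first one as a whole unit instead of a single turn or the entire session. A real system would replace both heuristics with learned components and dense retrieval.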
Forward citations
Cited by 12 Pith papers
- Trojan Hippo: Weaponizing Agent Memory for Data Exfiltration
  Trojan Hippo attacks on LLM agent memory achieve 85-100% success rates in data exfiltration across four memory backends even after 100 benign sessions, while evaluated defenses reduce success rates but impose varying ...
- Towards Root Memories: Benchmarking and Enhancing Implicit Logical Memory Retrieval for Personalized LLMs
  Introduces IMLogic benchmark for implicit logical memory retrieval and RootMem framework that distills user histories into root memories and routes them via LLM to improve personalized LLM accuracy.
- ElasticMem: Latent Memory as a Learnable Resource for LLM Agents
  ElasticMem enables LLM agents to learn adaptive latent memory retrieval and elastic budget allocation, improving QA accuracy by 24-26% and ALFWorld success by 27-66% over baselines with lower token cost.
- Trojan Hippo: Weaponizing Agent Memory for Data Exfiltration
  The paper defines and evaluates Trojan Hippo attacks on LLM agent memory, showing 85-100% success in data exfiltration across backends and reduced rates with defenses at varying utility costs.
- Memory-R1: Enhancing Large Language Model Agents to Manage and Utilize Memories via Reinforcement Learning
  Memory-R1 uses PPO and GRPO to train a Memory Manager (ADD/UPDATE/DELETE/NOOP) and Answer Agent that together outperform baselines on long-context QA benchmarks after training on only 152 examples.
- A-TMA: Decoupling State-Aware Memory Failures in Long-Term Agent Memory
  ATMA adds state labels and evidence packets to existing memory systems to reduce ghost memory failures, with reported gains on a new LTP benchmark and LoCoMo.
- MGRetrieval: Memory-Guided Reflective Retrieval for Long-Term Dialogue Agents
  MGRetrieval grounds reflective retrieval in historical memory structure for long-term dialogue, yielding 8.91% F1 and 11.11% BLEU-1 gains over baselines on LoCoMo with Qwen models.
- Improving Multi-turn Dialogue Consistency with Self-Recall Thinking
  SRT framework improves multi-turn dialogue F1 by 4.7% and cuts end-to-end latency by 14.7% via dependency construction, capability initialization, and reasoning improvement with recall tokens.
- HSUGA: LLM-Enhanced Recommendation with Hierarchical Semantic Understanding and Group-Aware Alignment
  HSUGA improves LLM-enhanced sequential recommendation via staged hierarchical semantic understanding for better preference extraction and group-aware alignment that varies intensity by user activity level.
- From Human Memory to AI Memory: A Survey on Memory Mechanisms in the Era of LLMs
  The paper surveys human memory categories, maps them to LLM memory, and proposes a new three-dimension (object, form, time) categorization into eight quadrants to organize existing work and highlight open problems.
- Back to Basics: Let Conversational Agents Remember with Just Retrieval and Generation
  A minimalist retrieval-and-generation framework using turn isolation and query-driven pruning outperforms complex memory systems by directly addressing signal sparsity and dual-level redundancy in dialogues.
- A Survey of Scaling in Large Language Model Reasoning
  A survey categorizing scaling in LLM reasoning across input size, steps, rounds, training, and future directions, noting that scaling can negatively affect performance.