hub

LightMem: Lightweight and Efficient Memory-Augmented Generation

Jizhan Fang, Xinle Deng, Haoming Xu, Ziyan Jiang, Yuqi Tang, Ziwen Xu · 2025 · cs.CL · arXiv 2510.18866

22 Pith papers cite this work. Polarity classification is still indexing.

22 Pith papers citing it

open full Pith review browse 22 citing papers arXiv PDF

abstract

Despite their remarkable capabilities, Large Language Models (LLMs) struggle to effectively leverage historical interaction information in dynamic and complex environments. Memory systems enable LLMs to move beyond stateless interactions by introducing persistent information storage, retrieval, and utilization mechanisms. However, existing memory systems often introduce substantial time and computational overhead. To this end, we introduce a new memory system called LightMem, which strikes a balance between the performance and efficiency of memory systems. Inspired by the Atkinson-Shiffrin model of human memory, LightMem organizes memory into three complementary stages. First, cognition-inspired sensory memory rapidly filters irrelevant information through lightweight compression and groups information according to their topics. Next, topic-aware short-term memory consolidates these topic-based groups, organizing and summarizing content for more structured access. Finally, long-term memory with sleep-time update employs an offline procedure that decouples consolidation from online inference. On LongMemEval and LoCoMo, using GPT and Qwen backbones, LightMem consistently surpasses strong baselines, improving QA accuracy by up to 7.7% / 29.3%, reducing total token usage by up to 38x / 20.9x and API calls by up to 30x / 55.5x, while purely online test-time costs are even lower, achieving up to 106x / 117x token reduction and 159x / 310x fewer API calls. The code is available at https://github.com/zjunlp/LightMem.

hub tools

JSON dossier citing papers JSON arXiv source

citation-role summary

background 2 baseline 2

citation-polarity summary

background 2 baseline 2

representative citing papers

MedMemoryBench: Benchmarking Agent Memory in Personalized Healthcare

cs.AI · 2026-05-12 · conditional · novelty 8.0

MedMemoryBench supplies a 2,000-session synthetic medical trajectory dataset and an evaluate-while-constructing streaming protocol to expose memory saturation and reasoning failures in current agent architectures for personalized healthcare.

AlpsBench: An LLM Personalization Benchmark for Real-Dialogue Memorization and Preference Alignment

cs.CL · 2026-03-09 · unverdicted · novelty 8.0

AlpsBench supplies 2500 real-dialogue sequences with verified memories to benchmark LLM extraction, updating, retrieval, and utilization of personalized information.

MemGym: a Long-Horizon Memory Environment for LLM Agents

cs.CL · 2026-05-20 · unverdicted · novelty 7.0

MemGym unifies agent gyms into a memory benchmark with isolated scoring across tool-use, research, coding, and computer-use regimes plus a lightweight reward model for tractable coding evaluation.

EXG: Self-Evolving Agents with Experience Graphs

cs.AI · 2026-05-18 · unverdicted · novelty 7.0

EXG is an experience graph framework for self-evolving LLM agents that supports online real-time growth and offline reuse to enhance solution quality and efficiency on code generation and reasoning benchmarks.

When Stored Evidence Stops Being Usable: Scale-Conditioned Evaluation of Agent Memory

cs.AI · 2026-05-08 · unverdicted · novelty 7.0

A new evaluation protocol shows agent memory reliability degrades variably with added irrelevant sessions depending on agent, memory interface, and scale.

Learning How and What to Memorize: Cognition-Inspired Two-Stage Optimization for Evolving Memory

cs.CL · 2026-05-01 · unverdicted · novelty 7.0

MemCoE learns memory organization guidelines via contrastive feedback and then trains a guideline-aligned RL policy for memory updates, yielding consistent gains on personalization benchmarks.

PERMA: Benchmarking Personalized Memory Agents via Event-Driven Preference and Realistic Task Environments

cs.AI · 2026-03-24 · unverdicted · novelty 7.0

PERMA is a new benchmark using temporally ordered events, text variability, and linguistic alignment to evaluate LLM memory agents on persona consistency beyond simple retrieval.

Self-Evolving Multi-Agent Systems via Decentralized Memory

cs.MA · 2026-05-21 · unverdicted · novelty 6.0

DecentMem is a decentralized dual-pool memory framework for self-evolving multi-agent systems that provides O(log T) regret guarantees and yields up to 23.8% accuracy gains over centralized baselines.

Auto-Dreamer: Learning Offline Memory Consolidation for Language Agents

cs.CL · 2026-05-20 · unverdicted · novelty 6.0

Auto-Dreamer trains an offline memory consolidator via GRPO on agent performance to abstract cross-session patterns, outperforming baselines by 7 points on ScienceWorld with 12x smaller memory and generalizing to ALFWorld and WebArena.

PRISM: Pareto-Efficient Retrieval over Intent-Aware Structured Memory for Long-Horizon Agents

cs.CL · 2026-05-12 · unverdicted · novelty 6.0 · 2 refs

PRISM is a new inference-time retrieval system that achieves higher accuracy than baselines on long-horizon agent tasks while using an order of magnitude less context by combining hierarchical graph search, intent-based costing, compression, and adaptive routing over structured memory.

From History to State: Constant-Context Skill Learning for LLM Agents

cs.AI · 2026-05-06 · unverdicted · novelty 6.0

Constant-context skill learning trains reusable task-family modules for LLM agents using a deterministic state block for progress tracking and subgoal rewards, achieving 89.6% unseen success on ALFWorld, 76.8% on WebShop, and 66.4% on SciWorld with Qwen3-8B while reducing prompt tokens 2-7x.

Experience Compression Spectrum: Unifying Memory, Skills, and Rules in LLM Agents

cs.AI · 2026-04-17 · conditional · novelty 6.0

The Experience Compression Spectrum unifies memory, skills, and rules in LLM agents along increasing compression levels and identifies the absence of adaptive cross-level compression as the missing diagonal.

PRISM-MCTS: Learning from Reasoning Trajectories with Metacognitive Reflection

cs.AI · 2026-04-07 · unverdicted · novelty 6.0

PRISM-MCTS improves MCTS-based reasoning efficiency by maintaining a shared memory of heuristics and fallacies reinforced by a process reward model, halving required trajectories on GPQA while outperforming prior methods.

HyMem: Hybrid Memory Architecture with Dynamic Retrieval Scheduling

cs.AI · 2026-02-15 · unverdicted · novelty 6.0

HyMem introduces dual-granular memory storage with a lightweight summary module for fast responses and selective activation of a deep LLM module for complex queries, outperforming full-context baselines by 92.6% lower computational cost on LOCOMO and LongMemEval benchmarks.

Agentic Learner with Grow-and-Refine Multimodal Semantic Memory

cs.AI · 2025-11-26 · unverdicted · novelty 6.0

ViLoMem is a dual-stream grow-and-refine memory system that separates visual and logical error patterns in MLLMs to improve pass@1 accuracy and reduce repeated mistakes across six multimodal benchmarks.

HyperMem: Hypergraph Memory for Long-Term Conversations

cs.CL · 2026-04-09 · unverdicted · novelty 5.0

HyperMem is a hypergraph memory architecture that groups related conversation episodes and facts via hyperedges and reports 92.73% LLM-as-a-judge accuracy on the LoCoMo benchmark.

MemOCR: Layout-Aware Visual Memory for Efficient Long-Horizon Reasoning

cs.AI · 2026-01-29 · unverdicted · novelty 5.0

MemOCR renders structured memory as images with adaptive visual density to improve long-horizon reasoning under tight context budgets.

Memory as Metabolism: A Design for Companion Knowledge Systems

cs.AI · 2026-04-13 · unverdicted · novelty 4.0

This paper designs a companion knowledge system with TRIAGE, DECAY, CONTEXTUALIZE, CONSOLIDATE, and AUDIT operations plus memory gravity and minority-hypothesis retention to give contradictory evidence a path to update dominant interpretations in personal LLM wikis.

Back to Basics: Let Conversational Agents Remember with Just Retrieval and Generation

cs.CL · 2026-04-13 · unverdicted · novelty 4.0

A minimalist retrieval-and-generation framework using turn isolation and query-driven pruning outperforms complex memory systems by directly addressing signal sparsity and dual-level redundancy in dialogues.

Agentic Reasoning for Large Language Models

cs.AI · 2026-01-18 · unverdicted · novelty 4.0

The survey structures agentic reasoning for LLMs into foundational, self-evolving, and collective multi-agent layers while distinguishing in-context orchestration from post-training optimization and reviewing applications across domains.

EvoIR-Agent: Self-Evolving Image Restoration Agentic System via Experience-Driven Learning

cs.CV · 2026-05-21

DimMem: Dimensional Structuring for Efficient Long-Term Agent Memory

cs.CL · 2026-05-15

citing papers explorer

Showing 22 of 22 citing papers.

MedMemoryBench: Benchmarking Agent Memory in Personalized Healthcare cs.AI · 2026-05-12 · conditional · none · ref 8 · internal anchor
MedMemoryBench supplies a 2,000-session synthetic medical trajectory dataset and an evaluate-while-constructing streaming protocol to expose memory saturation and reasoning failures in current agent architectures for personalized healthcare.
AlpsBench: An LLM Personalization Benchmark for Real-Dialogue Memorization and Preference Alignment cs.CL · 2026-03-09 · unverdicted · none · ref 7 · internal anchor
AlpsBench supplies 2500 real-dialogue sequences with verified memories to benchmark LLM extraction, updating, retrieval, and utilization of personalized information.
MemGym: a Long-Horizon Memory Environment for LLM Agents cs.CL · 2026-05-20 · unverdicted · none · ref 11 · internal anchor
MemGym unifies agent gyms into a memory benchmark with isolated scoring across tool-use, research, coding, and computer-use regimes plus a lightweight reward model for tractable coding evaluation.
EXG: Self-Evolving Agents with Experience Graphs cs.AI · 2026-05-18 · unverdicted · none · ref 6 · internal anchor
EXG is an experience graph framework for self-evolving LLM agents that supports online real-time growth and offline reuse to enhance solution quality and efficiency on code generation and reasoning benchmarks.
When Stored Evidence Stops Being Usable: Scale-Conditioned Evaluation of Agent Memory cs.AI · 2026-05-08 · unverdicted · none · ref 44 · internal anchor
A new evaluation protocol shows agent memory reliability degrades variably with added irrelevant sessions depending on agent, memory interface, and scale.
Learning How and What to Memorize: Cognition-Inspired Two-Stage Optimization for Evolving Memory cs.CL · 2026-05-01 · unverdicted · none · ref 124 · internal anchor
MemCoE learns memory organization guidelines via contrastive feedback and then trains a guideline-aligned RL policy for memory updates, yielding consistent gains on personalization benchmarks.
PERMA: Benchmarking Personalized Memory Agents via Event-Driven Preference and Realistic Task Environments cs.AI · 2026-03-24 · unverdicted · none · ref 10 · internal anchor
PERMA is a new benchmark using temporally ordered events, text variability, and linguistic alignment to evaluate LLM memory agents on persona consistency beyond simple retrieval.
Self-Evolving Multi-Agent Systems via Decentralized Memory cs.MA · 2026-05-21 · unverdicted · none · ref 21 · internal anchor
DecentMem is a decentralized dual-pool memory framework for self-evolving multi-agent systems that provides O(log T) regret guarantees and yields up to 23.8% accuracy gains over centralized baselines.
Auto-Dreamer: Learning Offline Memory Consolidation for Language Agents cs.CL · 2026-05-20 · unverdicted · none · ref 6 · internal anchor
Auto-Dreamer trains an offline memory consolidator via GRPO on agent performance to abstract cross-session patterns, outperforming baselines by 7 points on ScienceWorld with 12x smaller memory and generalizing to ALFWorld and WebArena.
PRISM: Pareto-Efficient Retrieval over Intent-Aware Structured Memory for Long-Horizon Agents cs.CL · 2026-05-12 · unverdicted · none · ref 5 · 2 links · internal anchor
PRISM is a new inference-time retrieval system that achieves higher accuracy than baselines on long-horizon agent tasks while using an order of magnitude less context by combining hierarchical graph search, intent-based costing, compression, and adaptive routing over structured memory.
From History to State: Constant-Context Skill Learning for LLM Agents cs.AI · 2026-05-06 · unverdicted · none · ref 3 · internal anchor
Constant-context skill learning trains reusable task-family modules for LLM agents using a deterministic state block for progress tracking and subgoal rewards, achieving 89.6% unseen success on ALFWorld, 76.8% on WebShop, and 66.4% on SciWorld with Qwen3-8B while reducing prompt tokens 2-7x.
Experience Compression Spectrum: Unifying Memory, Skills, and Rules in LLM Agents cs.AI · 2026-04-17 · conditional · none · ref 6 · internal anchor
The Experience Compression Spectrum unifies memory, skills, and rules in LLM agents along increasing compression levels and identifies the absence of adaptive cross-level compression as the missing diagonal.
PRISM-MCTS: Learning from Reasoning Trajectories with Metacognitive Reflection cs.AI · 2026-04-07 · unverdicted · none · ref 1 · internal anchor
PRISM-MCTS improves MCTS-based reasoning efficiency by maintaining a shared memory of heuristics and fallacies reinforced by a process reward model, halving required trajectories on GPQA while outperforming prior methods.
HyMem: Hybrid Memory Architecture with Dynamic Retrieval Scheduling cs.AI · 2026-02-15 · unverdicted · none · ref 18 · internal anchor
HyMem introduces dual-granular memory storage with a lightweight summary module for fast responses and selective activation of a deep LLM module for complex queries, outperforming full-context baselines by 92.6% lower computational cost on LOCOMO and LongMemEval benchmarks.
Agentic Learner with Grow-and-Refine Multimodal Semantic Memory cs.AI · 2025-11-26 · unverdicted · none · ref 12 · internal anchor
ViLoMem is a dual-stream grow-and-refine memory system that separates visual and logical error patterns in MLLMs to improve pass@1 accuracy and reduce repeated mistakes across six multimodal benchmarks.
HyperMem: Hypergraph Memory for Long-Term Conversations cs.CL · 2026-04-09 · unverdicted · none · ref 2 · internal anchor
HyperMem is a hypergraph memory architecture that groups related conversation episodes and facts via hyperedges and reports 92.73% LLM-as-a-judge accuracy on the LoCoMo benchmark.
MemOCR: Layout-Aware Visual Memory for Efficient Long-Horizon Reasoning cs.AI · 2026-01-29 · unverdicted · none · ref 9 · internal anchor
MemOCR renders structured memory as images with adaptive visual density to improve long-horizon reasoning under tight context budgets.
Memory as Metabolism: A Design for Companion Knowledge Systems cs.AI · 2026-04-13 · unverdicted · none · ref 8 · internal anchor
This paper designs a companion knowledge system with TRIAGE, DECAY, CONTEXTUALIZE, CONSOLIDATE, and AUDIT operations plus memory gravity and minority-hypothesis retention to give contradictory evidence a path to update dominant interpretations in personal LLM wikis.
Back to Basics: Let Conversational Agents Remember with Just Retrieval and Generation cs.CL · 2026-04-13 · unverdicted · none · ref 5 · internal anchor
A minimalist retrieval-and-generation framework using turn isolation and query-driven pruning outperforms complex memory systems by directly addressing signal sparsity and dual-level redundancy in dialogues.
Agentic Reasoning for Large Language Models cs.AI · 2026-01-18 · unverdicted · none · ref 300 · internal anchor
The survey structures agentic reasoning for LLMs into foundational, self-evolving, and collective multi-agent layers while distinguishing in-context orchestration from post-training optimization and reviewing applications across domains.
EvoIR-Agent: Self-Evolving Image Restoration Agentic System via Experience-Driven Learning cs.CV · 2026-05-21 · unreviewed · ref 24 · internal anchor
DimMem: Dimensional Structuring for Efficient Long-Term Agent Memory cs.CL · 2026-05-15 · unreviewed · ref 6 · internal anchor

LightMem: Lightweight and Efficient Memory-Augmented Generation

hub tools

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer