$\delta$-mem: Efficient Online Memory for Large Language Models

2026 · cs.AI · arXiv 2605.12357

1 Pith paper cites this work. Polarity classification is still indexing.



abstract

Large language models increasingly need to accumulate and reuse historical information in long-term assistants and agent systems. Simply expanding the context window is costly and often fails to ensure effective context utilization. We propose $\delta$-mem, a lightweight memory mechanism that augments a frozen full-attention backbone with a compact online state of associative memory. $\delta$-mem compresses past information into a fixed-size state matrix updated by delta-rule learning, and uses its readout to generate low-rank corrections to the backbone's attention computation during generation. With only an $8\times8$ online memory state, $\delta$-mem improves the average score to $1.10\times$ that of the frozen backbone and $1.15\times$ that of the strongest non-$\delta$-mem memory baseline. It achieves larger gains on memory-heavy benchmarks, reaching $1.31\times$ on MemoryAgentBench and $1.20\times$ on LoCoMo, while largely preserving general capabilities. These results show that effective memory can be realized through a compact online state directly coupled with attention computation, without full fine-tuning, backbone replacement, or explicit context extension.
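The abstract describes a fixed-size state matrix updated by delta-rule learning and then read out. As a rough illustration of that idea only, the sketch below uses the textbook delta-rule form $S \leftarrow S + \beta\,(v - S k)\,k^\top$ on an $8\times8$ state. The class name `DeltaMemory`, the methods `write` and `read`, the step size `beta`, and the one-hot keys are all hypothetical choices made for this example; the abstract does not give the paper's actual update equations, and the low-rank correction to attention is not modelled here.

```python
# Illustrative delta-rule associative memory. This is NOT the paper's
# implementation: every name and the exact update form are assumptions.

class DeltaMemory:
    def __init__(self, dim=8):
        self.dim = dim
        # Fixed-size state matrix S (dim x dim), initialised to zero.
        self.S = [[0.0] * dim for _ in range(dim)]

    def read(self, key):
        # Readout: the matrix-vector product S @ key.
        return [sum(row[j] * key[j] for j in range(self.dim)) for row in self.S]

    def write(self, key, value, beta=1.0):
        # Delta rule: S <- S + beta * (value - S @ key) key^T.
        # The state moves toward the new value only by its prediction error,
        # so its size never grows with the number of writes.
        pred = self.read(key)
        err = [value[i] - pred[i] for i in range(self.dim)]
        for i in range(self.dim):
            for j in range(self.dim):
                self.S[i][j] += beta * err[i] * key[j]


def one_hot(i, dim=8):
    # Hypothetical unit-norm key used only for this demonstration.
    return [1.0 if j == i else 0.0 for j in range(dim)]


mem = DeltaMemory(dim=8)
k = one_hot(2)
mem.write(k, [1.0] * 8)   # store a first association for key k
mem.write(k, [0.5] * 8)   # a later write for the same key replaces it
recalled = mem.read(k)
unrelated = mem.read(one_hot(3))
```

With a unit-norm key and `beta=1.0`, the second write replaces the first association rather than accumulating on top of it, and a query with an orthogonal key reads back zeros. This error-correcting behaviour is what distinguishes a delta-rule update from a purely additive one.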

representative citing papers

On the Scaling of PEFT: Towards Million Personal Models of Trillion Parameters

cs.LG · 2026-06-01 · unverdicted · novelty 5.0

PEFT adapters are positioned as persistent personal state on foundation models, organized via Scale Up, Scale Down, and Scale Out axes, with MinT as an infrastructure example for managing them.

citing papers explorer

Showing 1 of 1 citing paper.

On the Scaling of PEFT: Towards Million Personal Models of Trillion Parameters

cs.LG · 2026-06-01 · unverdicted · none · ref 16 · internal anchor

PEFT adapters are positioned as persistent personal state on foundation models, organized via Scale Up, Scale Down, and Scale Out axes, with MinT as an infrastructure example for managing them.
