
Memory as Action: Autonomous Context Curation for Long-Horizon Agentic Tasks

8 Pith papers cite this work. Polarity classification is still indexing.

abstract

Long-context Large Language Models, despite their expanded capacity, require careful working memory management to mitigate attention dilution during long-horizon tasks. Yet existing approaches rely on external mechanisms that lack awareness of the agent's reasoning state, leading to suboptimal decisions. We propose Memory-as-Action (MemAct), a framework that treats working memory management as learnable policy actions. By formulating context management as in-place editing operations (deletion, insertion), MemAct enables joint optimization of information retention and task performance through end-to-end reinforcement learning. To address the computational challenges of dynamic context updates, we introduce Dynamic Context Policy Optimization, which restores training efficiency without compromising reasoning integrity. Experiments show that MemAct-RL-14B matches the accuracy of models 16× larger while reducing average context length by 51%, with learned strategies that adapt to model capabilities and generalize across task complexities.
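
The abstract describes context management as in-place editing operations (deletion, insertion) on the agent's working memory. As a rough illustration only, the sketch below models a context as a list of segments and applies one deletion and one insertion by hand. The `Segment` and `WorkingContext` names, the whitespace token count, and the hard-coded edits are all assumptions made for this example; they are not taken from the paper, where the edits are chosen by a policy trained with reinforcement learning.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    """One entry in the working context (hypothetical structure)."""
    seg_id: int
    text: str


@dataclass
class WorkingContext:
    """Toy context buffer supporting in-place deletion and insertion."""
    segments: list[Segment] = field(default_factory=list)
    _next_id: int = 0

    def append(self, text: str) -> int:
        """Add a new segment (e.g. a tool result) and return its id."""
        seg = Segment(self._next_id, text)
        self.segments.append(seg)
        self._next_id += 1
        return seg.seg_id

    def delete(self, seg_ids: set[int]) -> None:
        """Deletion action: drop the named segments."""
        self.segments = [s for s in self.segments if s.seg_id not in seg_ids]

    def insert(self, position: int, text: str) -> int:
        """Insertion action: place new text (e.g. a summary) at a position."""
        seg = Segment(self._next_id, text)
        self.segments.insert(position, seg)
        self._next_id += 1
        return seg.seg_id

    def length(self) -> int:
        """Rough context size in whitespace-separated tokens (an assumption)."""
        return sum(len(s.text.split()) for s in self.segments)


# Hand-written edits standing in for actions a learned policy would emit.
ctx = WorkingContext()
a = ctx.append("search result one with many words")
b = ctx.append("search result two with many words")
ctx.append("current question")
before = ctx.length()

ctx.delete({a, b})                         # remove two stale tool results
ctx.insert(0, "summary of both results")   # keep a short summary instead
after = ctx.length()
```

In this toy run the context shrinks because two long segments are replaced by one short summary; how and when to make such edits is exactly what the paper proposes to learn rather than hard-code.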

citation-role summary

background: 1, baseline: 1

citation-polarity summary

years

2026: 7, 2025: 1
