MEMENTO trains LLMs to segment reasoning into blocks, generate mementos as dense summaries, and reason forward using only mementos and KV states, cutting peak KV cache by ~2.5x while preserving benchmark accuracy.
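The cache saving described above can be illustrated with a toy token count. This is a minimal sketch, not the paper's implementation: the function name `peak_cache_tokens`, the block lengths, and the memento length are all hypothetical, and it assumes a finished block is dropped from the cache as soon as its memento has been written.

```python
def peak_cache_tokens(block_lengths, memento_length, compress=True):
    """Return the peak number of tokens held in the cache (toy accounting).

    Assumption: with compress=True, each finished block is replaced by a
    fixed-size memento; with compress=False, every token is kept.
    """
    retained = 0  # tokens kept from blocks that are already finished
    peak = 0
    for length in block_lengths:
        if compress:
            # The memento is written while its block is still cached,
            # so both count toward the peak at this point.
            live = retained + length + memento_length
            retained += memento_length  # afterwards only the memento stays
        else:
            live = retained + length
            retained += length  # the whole block stays in the cache
        peak = max(peak, live)
    return peak


# Illustrative numbers only; they are not taken from the paper.
blocks = [400, 400, 400, 400, 400]
baseline = peak_cache_tokens(blocks, memento_length=50, compress=False)
compressed = peak_cache_tokens(blocks, memento_length=50, compress=True)
print(baseline, compressed)
```

In this toy setting the peak grows with the number of mementos rather than with the total reasoning length. The actual ratio depends on block and memento sizes, which the summary does not specify.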
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.AI (1). Years: 2026 (1). Verdicts: unverdicted (1).
Representative citing papers:
MEMENTO: Teaching LLMs to Manage Their Own Context