Mem-π is a framework using a dedicated model and decision-content decoupled RL to generate context-specific guidance on demand for LLM agents, outperforming retrieval baselines by over 30% on web navigation.
MemoryLLM: Plug-n- play interpretable feed-forward memory for transformers
3 Pith papers cite this work. Polarity classification is still indexing.
years
2026 3verdicts
UNVERDICTED 3representative citing papers
TIDE augments standard transformers with per-layer token embedding injection via an ensemble of memory blocks and a depth-conditioned router to mitigate rare-token undertraining and contextual collapse.
Graph Memory Transformer (GMT) swaps dense FFN sublayers for a graph of 128 centroids and a learned 128x128 transition matrix per block, yielding a 82M-parameter decoder-only LM that trains stably but trails a 103M dense baseline in perplexity.
citing papers explorer
-
Mem-$\pi$: Adaptive Memory through Learning When and What to Generate
Mem-π is a framework using a dedicated model and decision-content decoupled RL to generate context-specific guidance on demand for LLM agents, outperforming retrieval baselines by over 30% on web navigation.
-
TIDE: Every Layer Knows the Token Beneath the Context
TIDE augments standard transformers with per-layer token embedding injection via an ensemble of memory blocks and a depth-conditioned router to mitigate rare-token undertraining and contextual collapse.
-
Graph Memory Transformer (GMT)
Graph Memory Transformer (GMT) swaps dense FFN sublayers for a graph of 128 centroids and a learned 128x128 transition matrix per block, yielding a 82M-parameter decoder-only LM that trains stably but trails a 103M dense baseline in perplexity.