PEEK maintains a constant-sized context map via a programmable cache policy to give LLM agents persistent orientation knowledge about recurring external contexts, yielding 6-34% gains and lower cost than prior prompt-learning methods.
Prompt cache: Modular attention reuse for low-latency inference.Proceedings of Machine Learning and Systems, 6:325–338, 2024
2 Pith papers cite this work. Polarity classification is still indexing.
2
Pith papers citing it
fields
cs.AI 2years
2026 2verdicts
UNVERDICTED 2representative citing papers
LAR learns a compact latent action space from trajectories that shortens the effective decision horizon for LLM agents, reducing token count and inference time while preserving task success.
citing papers explorer
-
PEEK: Context Map as an Orientation Cache for Long-Context LLM Agents
PEEK maintains a constant-sized context map via a programmable cache policy to give LLM agents persistent orientation knowledge about recurring external contexts, yielding 6-34% gains and lower cost than prior prompt-learning methods.
-
Latent Action Reparameterization for Efficient Agent Inference
LAR learns a compact latent action space from trajectories that shortens the effective decision horizon for LLM agents, reducing token count and inference time while preserving task success.