Agent-X achieves 1.61x end-to-end speedup for on-device LLM agents via prompt rewriting for prefix caching and LLM-free speculative decoding, with no accuracy loss on representative workloads.
Gulavani, Alexey Tumanov, and Ramachandran Ramjee
1 Pith paper cites this work. Polarity classification is still indexing.
Citation-role summary: background 1
Fields: cs.AI 1
Years: 2026 1
Verdicts: UNVERDICTED 1
Polarities: background 1
Representative citing papers:
Agent-X: Full Pipeline Acceleration of On-device AI Agents