Lightweight numerical bandits on text embeddings match or exceed LLM accuracy in contextual bandits at a fraction of the cost, with an embedding-based diagnostic to choose between them.
Efficient sequential decision making with large language models
2 Pith papers cite this work. Polarity classification is still indexing.
2
Pith papers citing it
citation-role summary
background 1
citation-polarity summary
fields
cs.AI 2years
2026 2verdicts
UNVERDICTED 2roles
background 1polarities
unclear 1representative citing papers
LAR learns a compact latent action space from trajectories that shortens the effective decision horizon for LLM agents, reducing token count and inference time while preserving task success.
citing papers explorer
-
When Do We Need LLMs? A Diagnostic for Language-Driven Bandits
Lightweight numerical bandits on text embeddings match or exceed LLM accuracy in contextual bandits at a fraction of the cost, with an embedding-based diagnostic to choose between them.
-
Latent Action Reparameterization for Efficient Agent Inference
LAR learns a compact latent action space from trajectories that shortens the effective decision horizon for LLM agents, reducing token count and inference time while preserving task success.