Lightweight numerical bandits on text embeddings match or exceed LLM accuracy in contextual bandits at a fraction of the cost, with an embedding-based diagnostic to choose between them.
2 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Representative citing papers:
Calibration-gated LLM pseudo-observations reduce cumulative regret by 19% versus pure LinUCB on a 5-arm news recommendation task when using task-specific prompts, but generic prompts increase regret on both tested environments.
Citing papers explorer
- When Do We Need LLMs? A Diagnostic for Language-Driven Bandits
Lightweight numerical bandits on text embeddings match or exceed LLM accuracy in contextual bandits at a fraction of the cost, with an embedding-based diagnostic to choose between them.
- Calibration-Gated LLM Pseudo-Observations for Online Contextual Bandits
Calibration-gated LLM pseudo-observations reduce cumulative regret by 19% versus pure LinUCB on a 5-arm news recommendation task when using task-specific prompts, but generic prompts increase regret on both tested environments.