2 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (2)
Verdicts: unverdicted (2)
Citing papers
- SeqRoute: Global Budget-Aware Sequential LLM Routing via Offline Reinforcement Learning
  SeqRoute applies offline RL with CQL and Hindsight Budget Relabeling to sequential LLM routing under global budgets, claiming 6.0-73.5% cost reduction, maintained or improved quality, and under 1% bankruptcy rate.
- R³AG: Retriever Routing for Retrieval-Augmented Generation
  R³AG routes queries to retrievers by decomposing capabilities into retrieval quality and generation utility, trained via contrastive learning on document assessments and downstream answer correctness to outperform static methods.