pith. sign in

Thompson Sampling contextual bandit over heterogeneous tools (PubMed, drug DBs, calculator, web) with composite reward including latency

1 Pith paper cite this work. Polarity classification is still indexing.

1 Pith paper citing it

fields

cs.LG 1

years

2026 1

verdicts

UNVERDICTED 1

representative citing papers

Latency-Quality Routing for Functionally Equivalent Tools in LLM Agents

cs.LG · 2026-05-14 · unverdicted · novelty 6.0

LQM-ContextRoute routes tool calls by expected quality per service cycle using contextual bandits and LLM-as-judge feedback, yielding +2.18 pp F1, up to +18 pp accuracy, and +2.91-3.22 pp NDCG gains over SW-UCB on web-search, StrategyQA, and retriever benchmarks.

citing papers explorer

Showing 1 of 1 citing paper.

  • Latency-Quality Routing for Functionally Equivalent Tools in LLM Agents cs.LG · 2026-05-14 · unverdicted · none · ref 5

    LQM-ContextRoute routes tool calls by expected quality per service cycle using contextual bandits and LLM-as-judge feedback, yielding +2.18 pp F1, up to +18 pp accuracy, and +2.91-3.22 pp NDCG gains over SW-UCB on web-search, StrategyQA, and retriever benchmarks.