A router that decomposes uncertainty to flexibly route queries between cheap models and oracles while providing regret bounds and supporting abstention in classification tasks with multiple annotations.
Levente Kocsis and Csaba Szepesvári
8 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
roles
background 2polarities
background 2representative citing papers
Switchcraft routes agentic tool-calling queries to the lowest-cost model that preserves correctness, reaching 82.9% accuracy and 84% cost reduction on five benchmarks.
SecureRouter accelerates secure transformer inference by 1.95x via an encrypted router that selects input-adaptive models from an MPC-optimized pool with negligible accuracy loss.
BoundaryRouter routes queries to LLM or agent using early experience memory from a seed set, cutting inference time 60.6% versus always using agents and raising performance 28.6% versus always using direct LLM inference.
Develops COF algorithm for MAB-CS that intelligently checks cheap arm feasibility by pooling samples, with generalized instance-dependent lower bounds and matching upper bounds on cumulative cost and quality regret.
PPRoute achieves plaintext-level LLM routing quality with MPC-based privacy and a 20x speedup over naive encrypted implementations via MPC-friendly encoders, multi-step training, and O(1) communication Top-k search.
SMCS coordinates 15 open-source LLMs via retrieval-based prior selection and exploration-exploitation posterior enhancement, outperforming GPT-4.1 by 5.36% and GPT-o3-mini by 5.28% on eight benchmarks.
citing papers explorer
-
Flexible Routing via Uncertainty Decomposition
A router that decomposes uncertainty to flexibly route queries between cheap models and oracles while providing regret bounds and supporting abstention in classification tasks with multiple annotations.
-
Switchcraft: AI Model Router for Agentic Tool Calling
Switchcraft routes agentic tool-calling queries to the lowest-cost model that preserves correctness, reaching 82.9% accuracy and 84% cost reduction on five benchmarks.
-
SecureRouter: Encrypted Routing for Efficient Secure Inference
SecureRouter accelerates secure transformer inference by 1.95x via an encrypted router that selects input-adaptive models from an MPC-optimized pool with negligible accuracy loss.
-
Learning Agent Routing From Early Experience
BoundaryRouter routes queries to LLM or agent using early experience memory from a seed set, cutting inference time 60.6% versus always using agents and raising performance 28.6% versus always using direct LLM inference.
-
Cost-Ordered Feasibility for Multi-Armed Bandits with Cost Subsidy
Develops COF algorithm for MAB-CS that intelligently checks cheap arm feasibility by pooling samples, with generalized instance-dependent lower bounds and matching upper bounds on cumulative cost and quality regret.
-
Privacy-Preserving LLMs Routing
PPRoute achieves plaintext-level LLM routing quality with MPC-based privacy and a 20x speedup over naive encrypted implementations via MPC-friendly encoders, multi-step training, and O(1) communication Top-k search.
-
A Scalable Multi-LLM Collaboration System with Retrieval-based Selection and Exploration-Exploitation-Driven Enhancement
SMCS coordinates 15 open-source LLMs via retrieval-based prior selection and exploration-exploitation posterior enhancement, outperforming GPT-4.1 by 5.36% and GPT-o3-mini by 5.28% on eight benchmarks.
- Latency-Quality Routing for Functionally Equivalent Tools in LLM Agents