Switchcraft routes agentic tool-calling queries to the lowest-cost model that preserves correctness, reaching 82.9% accuracy and 84% cost reduction on five benchmarks.
TensorOpera Router: A multi-model router for efficient LLM inference. arXiv preprint arXiv:2408.12320
6 Pith papers cite this work.
Citation roles: background (1)
Citation polarities: background (1)
Representative citing papers
A well-tuned kNN router matches or exceeds state-of-the-art learned routers on new standardized benchmarks spanning instruction, QA, reasoning, and the first multi-modal visual routing dataset, due to locality of model performance in embedding space.
SWE-Router introduces trajectory-conditioned value-based routing for LLM agents on SWE tasks, with a Bayes-optimality theorem and empirical cost savings while retaining most strong-model performance.
CAMI frames multi-index construction for semantic retrieval as a budgeted multi-objective portfolio problem and uses agent-guided search plus confidence-aware pruning to find high-recall configurations with reduced evaluation cost.
IR3DE is a ridge regression router for domain-expert LLMs that matches or exceeds baselines in language modeling and reasoning tasks while allowing dynamic expert addition or removal without retraining.
A systematic survey of LLM ensemble methods organized into a taxonomy of ensemble-before-inference, ensemble-during-inference, and ensemble-after-inference stages, with review of benchmarks, applications, and future directions.
citing papers explorer
- Switchcraft: AI Model Router for Agentic Tool Calling