A unified approach to routing and cascading for LLMs. arXiv preprint arXiv:2410.10347

Dekoninck et al. · 2024 · arXiv 2410.10347

5 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

Flexible Routing via Uncertainty Decomposition

cs.LG · 2026-05-08 · unverdicted · novelty 7.0

A router that decomposes uncertainty to flexibly route queries between cheap models and oracles while providing regret bounds and supporting abstention in classification tasks with multiple annotations.

A Regime Theory of Controller Class Selection for LLM Action Decisions

cs.AI · 2026-05-07 · unverdicted · novelty 7.0

A regime theory selects the optimal controller class for LLM action decisions from a nested lattice of four classes using three data-estimable bottlenecks, with a Bernstein-tight threshold and empirical matches on multiple benchmarks.

LatentRouter: Can We Choose the Right Multimodal Model Before Seeing Its Answer?

cs.AI · 2026-05-11 · unverdicted · novelty 6.0

LatentRouter routes image-question queries to the best MLLM by predicting counterfactual performance via latent communication between learned query capsules and model capability tokens.

A Communication-Theoretic Framework for LLM Agents: Cost-Aware Adaptive Reliability

cs.LG · 2026-05-09 · unverdicted · novelty 6.0

LLM reliability techniques are unified as communication channel operators, with a new cost-aware router achieving superior quality-cost tradeoffs on hard tasks.

Harnessing Multiple Large Language Models: A Survey on LLM Ensemble

cs.CL · 2025-02-25 · unverdicted · novelty 2.0

A systematic survey of LLM ensemble methods organized into a taxonomy of ensemble-before-inference, ensemble-during-inference, and ensemble-after-inference stages, with review of benchmarks, applications, and future directions.

citing papers explorer

Showing 5 of 5 citing papers.

Flexible Routing via Uncertainty Decomposition

cs.LG · 2026-05-08 · unverdicted · none · ref 3

A Regime Theory of Controller Class Selection for LLM Action Decisions

cs.AI · 2026-05-07 · unverdicted · none · ref 23

LatentRouter: Can We Choose the Right Multimodal Model Before Seeing Its Answer?

cs.AI · 2026-05-11 · unverdicted · none · ref 10

A Communication-Theoretic Framework for LLM Agents: Cost-Aware Adaptive Reliability

cs.LG · 2026-05-09 · unverdicted · none · ref 47

Harnessing Multiple Large Language Models: A Survey on LLM Ensemble

cs.CL · 2025-02-25 · unverdicted · none · ref 7
