Why Ask One When You Can Ask $k$? Learning-to-Defer to the Top-$k$ Experts

Axel Carlier; Lai Xing Ng; Wei Tsang Ooi; Yannis Montreuil

arxiv: 2504.12988 · v5 · pith:5FCNQY2Lnew · submitted 2025-04-17 · 💻 cs.LG · stat.ML



keywords: top-learning-to-defer, deferral, experts, mathcal, across, consistent, expert


Existing Learning-to-Defer (L2D) frameworks are limited to single-expert deferral, forcing each query to rely on only one expert and preventing the use of collective expertise. We introduce the first framework for Top-$k$ Learning-to-Defer, which allocates queries to the $k$ most cost-effective entities. Our formulation unifies and strictly generalizes prior approaches, including the one-stage and two-stage regimes, selective prediction, and classical cascades. In particular, it recovers the usual Top-1 deferral rule as a special case while enabling principled collaboration with multiple experts when $k>1$. We further propose Top-$k(x)$ Learning-to-Defer, an adaptive variant that learns the optimal number of experts per query based on input difficulty, expert quality, and consultation cost. To enable practical learning, we develop a novel surrogate loss that is Bayes-consistent, $\mathcal{H}_h$-consistent in the one-stage setting, and $(\mathcal{H}_r,\mathcal{H}_g)$-consistent in the two-stage setting. Crucially, this surrogate is independent of $k$, allowing a single policy to be learned once and deployed flexibly across $k$. Experiments across both regimes show that Top-$k$ and Top-$k(x)$ deliver superior accuracy-cost trade-offs, opening a new direction for multi-expert deferral in L2D.
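
As a rough illustration of the selection step described in the abstract, the sketch below picks the $k$ highest-scoring entities from a single vector of per-entity scores. It is not the paper's method: the function name `top_k_entities`, the example scores, and the tie-breaking rule are assumptions made for illustration, and the surrogate loss that would produce such scores is not reproduced here. It only shows how one set of scores can serve several values of $k$, with $k=1$ reducing to ordinary single-entity deferral.

```python
from typing import List, Sequence


def top_k_entities(scores: Sequence[float], k: int) -> List[int]:
    """Return the indices of the k highest-scoring entities, best first.

    `scores` is a hypothetical per-entity score vector (higher is better),
    standing in for the output of a learned deferral policy.
    """
    if not 1 <= k <= len(scores):
        raise ValueError("k must be between 1 and the number of entities")
    # sorted() is stable, so entities with equal scores keep index order.
    order = sorted(range(len(scores)), key=lambda j: -scores[j])
    return order[:k]


# Hypothetical scores for four entities (say, a model and three experts).
scores = [0.2, 1.5, -0.3, 0.9]

print(top_k_entities(scores, 1))  # [1]        (Top-1 deferral)
print(top_k_entities(scores, 3))  # [1, 3, 0]  (same scores, larger k)
```

An adaptive Top-$k(x)$ variant would additionally choose $k$ per query; how that choice is learned is specified in the paper and is not sketched here.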


Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Coherent Hierarchical Multi-Label Learning to Defer for Medical Imaging
cs.AI 2026-05 unverdicted novelty 7.0

The work defines a Selective-Exclusion handoff contract for hierarchical L2D, proves nodewise Bayes rules can be incoherent, and supplies exact dynamic-programming projection and TBP+RPO that drive incoherence to near...

Optimized Deferral for Imbalanced Settings
cs.LG 2026-04 unverdicted novelty 5.0

MILD reformulates two-stage learning to defer as cost-sensitive learning over the input-expert domain and derives new margin-based losses with guarantees, yielding better performance than baselines on image classifica...