Doubly robust estimators that incorporate low-rank predictions enable valid finite-sample confidence intervals for best-model identification under adaptive sampling and without-replacement example selection in LLM evaluation.
Online multi-armed bandits with adaptive inference.Advances in Neural Information Processing Systems, 34:1939–1951
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
citation-role summary
method 1
citation-polarity summary
fields
cs.LG 1years
2026 1verdicts
UNVERDICTED 1roles
method 1polarities
use method 1representative citing papers
citing papers explorer
-
Valid Best-Model Identification for LLM Evaluation via Low-Rank Factorization
Doubly robust estimators that incorporate low-rank predictions enable valid finite-sample confidence intervals for best-model identification under adaptive sampling and without-replacement example selection in LLM evaluation.