A hierarchical framework generates statistically valid task-level rank confidence intervals via pairwise comparisons and leaderboard-level rank prediction intervals via conformal prediction.
Simultaneous confidence intervals for ranks using the partitioning principle. Electronic Journal of Statistics, 15(1): 3109–3134, 2021.
1 Pith paper cites this work. Polarity classification is still in progress.
Fields: stat.ML (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers:
- Rank Intervals for Leaderboards: A Hierarchical Framework for Model Evaluation