Proceedings of the AAAI Conference on Artificial Intelligence

PRECISE: Reducing the Bias of LLM Evaluations Using Prediction-Powered Ranking Estimation · 2026 · DOI 10.1609/aaai.v40i47.41427

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

Statistically Reliable LLM-Based Ranking Evaluation via Prediction-Powered Inference

cs.LG · 2026-06-03 · unverdicted · novelty 6.0

PRECISE extends prediction-powered inference (PPI) to ranking evaluation metrics, with a computational reduction for hierarchical metrics. It reports reduced standard error on ESCI and correct identification of the best system variant in production.

citing papers explorer

Statistically Reliable LLM-Based Ranking Evaluation via Prediction-Powered Inference · cs.LG · 2026-06-03 · unverdicted · none · ref 17
