Proxy metrics from next-token distributions over expert solutions outperform loss and compute baselines for ranking LLMs, selecting pretraining data, and extrapolating performance across compute scales.
Kim and Samuel Miserendino and Gildas Chabot and David Li and Patrick Chao and Michael Sharman and Alexandra Barr and Amelia Glaese and Jerry Tworek , booktitle=
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
fields
cs.CL 1years
2026 1verdicts
UNVERDICTED 1representative citing papers
citing papers explorer
-
Forecasting Downstream Performance of LLMs With Proxy Metrics
Proxy metrics from next-token distributions over expert solutions outperform loss and compute baselines for ranking LLMs, selecting pretraining data, and extrapolating performance across compute scales.