a utility weight vector $w_s = (w_0, \dots, w_{C_s}) \in \mathbb{R}^{C_s+1}$, encoding the relative value of each category · 2019

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

Ranking Reasoning LLMs under Test-Time Scaling

cs.LG · 2026-03-11 · accept · novelty 5.0

Many established statistical ranking techniques produce orderings of reasoning LLMs under test-time scaling that closely match a Bayesian gold standard, with mean Kendall tau_b of 0.93-0.95 at full trials and best methods reaching 0.86 at single trials.

citing papers explorer

Showing 1 of 1 citing paper.

Ranking Reasoning LLMs under Test-Time Scaling cs.LG · 2026-03-11 · accept · none · ref 17
