RuDE predicts post-training performance of base LLMs with over 90% correlation by using response discrimination on rubric-violating contrastive pairs, validated by RL to identify high-potential smaller models.
- SATISFIED_RUBRICS: These are grading responses where the current criteria_met status is correct and should remain unchanged after optimization.
1 Pith paper cites this work. Polarity classification is still indexing.
citation-role summary: background 1
citation-polarity summary
fields: cs.CL 1
years: 2026 1
verdicts: UNVERDICTED 1
roles: background 1
polarities: background 1
representative citing papers
On Predicting the Post-training Potential of Pre-trained LLMs