RLVR training for language models exhibits an unlearnability phenomenon where certain hard examples stay unlearnable due to low gradient similarity and ungeneralizable reasoning patterns.
arXiv preprint arXiv:2512.01775
2 Pith papers cite this work.
Citation summary
Years: 2026 (2)
Roles: baseline (1)
Polarities: baseline (1)
Representative citing papers
Recovering an orthogonal basis from model activations yields a model-native skill characterization that improves reasoning Pass@1 by up to 41% via targeted data selection and supports inference steering, outperforming human-characterized alternatives.
citing papers explorer
- The Unlearnability Phenomenon in RLVR for Language Models
- Characterizing Model-Native Skills