Mixed-objective reward models underperform single-objective ones because shared neurons support one objective while negatively affecting the other, creating alignment tension.
The Colorful Future of LLMs: Evaluating and Improving LLMs as Emotional Supporters for Queer Youth
2 Pith papers cite this work. Polarity classification is still indexing.