Proceedings of the Thirty-Fourth International Joint Conference on Artificial Intelligence,
2 Pith papers cite this work. Polarity classification is still indexing.
Citing papers by year: 2026 (2)
Verdicts: UNVERDICTED (2)
Citing papers
- When AI Says It Feels
  Training LLMs via rubric-based self-rewarding RL with GRPO improved feeling expression and robustness to sycophancy but degraded truthful QA performance.
- Task Decomposition for Efficient Annotation
  Decomposing annotation tasks using centers from centering theory reduces aggregate inferential load via a degrees-of-freedom model and enables better sub-task allocation.