PEER applies GRPO reinforcement learning with a unified process-outcome reward model to structured empathetic reasoning steps on the SER dataset, yielding gains in empathy, strategy alignment, and human-likeness.
arXiv, abs/2505.02686
1 Pith paper cites this work. Polarity classification is still indexing.
fields: cs.CL (1)
years: 2025 (1)
verdicts: UNVERDICTED (1)
representative citing papers
- PEER: Unified Process-Outcome Reinforcement Learning for Structured Empathetic Reasoning