PEARL is a pedagogically aligned RL framework using a controllable student simulator, generative reward model, and stable multi-objective scheme to train Socratic tutors that outperform other open-source models on benchmarks.
Title resolution pending
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
fields
cs.LG 1years
2026 1verdicts
UNVERDICTED 1representative citing papers
citing papers explorer
-
PEARL: Training Socratic Tutors with Pedagogically Aligned Reinforcement Learning
PEARL is a pedagogically aligned RL framework using a controllable student simulator, generative reward model, and stable multi-objective scheme to train Socratic tutors that outperform other open-source models on benchmarks.