Review history
Distributional Process Reward Models: Calibrated Prediction of Future Rewards via Conditional Optimal Transport
- 2026-05-13 UNVERDICTED
- 2026-05-11 UNVERDICTED