pith. machine review for the scientific record.

Review history

arxiv: 2605.03862 · 2 revisions

Correct Is Not Enough: Training Reasoning Planners with Executor-Grounded Rewards

  1. 2026-05-08 UNVERDICTED LOW v0.9.0 novelty 6.0
    37886 ms 5594 in 1251 out 2026-05-08T18:12:21.174348+00:00
  2. 2026-05-07 UNVERDICTED LOW v0.9.0 novelty 7.0
    58536 ms 5576 in 1658 out 2026-05-07T04:14:12.713028+00:00