pith. machine review for the scientific record.

Review history

arxiv: 2604.08178 · 2 revisions

Aligning Agents via Planning: A Benchmark for Trajectory-Level Reward Modeling

  1. 2026-05-12 UNVERDICTED LOW v0.9.0 novelty 7.0
    68428 ms 5572 in 1213 out 2026-05-12T02:55:14.323223+00:00
  2. 2026-05-10 UNVERDICTED LOW v0.9.0 novelty 7.0
    65738 ms 5572 in 1466 out 2026-05-10T17:08:37.242368+00:00