pith. machine review for the scientific record.

Review history

arxiv: 2605.03546 · 2 revisions

ProgramBench: Can Language Models Rebuild Programs From Scratch?

  1. 2026-05-07 UNVERDICTED LOW v0.9.0 novelty 7.0
    32931 ms 5525 in 989 out 2026-05-07T15:58:57.184136+00:00
  2. 2026-05-07 UNVERDICTED LOW v0.9.0 novelty 7.0
    26713 ms 5503 in 1156 out 2026-05-07T01:36:08.115032+00:00