Review history

arXiv: 2606.17041 · 2 revisions

Benchmarking LLM Agents on Meta-Analysis Articles from Nature Portfolio

  1. 2026-06-30 UNVERDICTED LOW v0.9.1-grok novelty 7.0
    48867 ms 5740 in 1222 out 2026-06-30T10:19:58.891339+00:00
  2. 2026-06-27 UNVERDICTED LOW v0.9.1-grok novelty 7.0
    65992 ms 5740 in 1447 out 2026-06-27T03:45:00.596921+00:00