Review history

arxiv: 2606.20724 · 2 revisions

When Web Agents Finish but Still Fail: Reproducible Triggers and Trace Diagnostics for Parallel Web Exploration

  1. 2026-06-30 · UNVERDICTED · LOW · v0.9.1-grok · novelty 5.0
    40147 ms · 5774 in · 1200 out · 2026-06-30T10:29:10.634308+00:00
  2. 2026-06-27 · CONDITIONAL · MODERATE · v0.9.1-grok · novelty 7.0
    24391 ms · 5774 in · 1245 out · 2026-06-27T00:20:36.509878+00:00