Review history

arxiv: 2605.10075 · 2 revisions

Active Testing of Large Language Models via Approximate Neyman Allocation

  1. 2026-05-20 UNVERDICTED LOW v0.9.0 novelty 7.0
    34771 ms 5689 in 1221 out 2026-05-20T23:01:09.742312+00:00
  2. 2026-05-12 UNVERDICTED LOW v0.9.0 novelty 6.0
    56697 ms 5458 in 1162 out 2026-05-12T04:00:05.146387+00:00