Review history

arxiv: 2605.03361 · 3 revisions

ReasonAudio: A Benchmark for Evaluating Reasoning Beyond Matching in Text-Audio Retrieval

  1. 2026-05-08 UNVERDICTED LOW v0.9.0 novelty 8.0
    40595 ms 5510 in 1333 out 2026-05-08T18:24:34.152635+00:00
  2. 2026-05-07 UNVERDICTED LOW v0.9.0 novelty 7.0
    40019 ms 5510 in 1093 out 2026-05-07T16:44:39.358093+00:00
  3. 2026-05-07 CONDITIONAL LOW v0.9.0 novelty 7.0
    29629 ms 5488 in 1143 out 2026-05-07T02:07:10.955686+00:00