Review history

arXiv: 2605.14498 · 2 revisions

GroupMemBench: Benchmarking LLM Agent Memory in Multi-Party Conversations

  1. 2026-05-20 UNVERDICTED MODERATE v0.9.0 novelty 7.0
    47379 ms 5833 in 1075 out 2026-05-20T21:26:33.940348+00:00
  2. 2026-05-15 CONDITIONAL LOW v0.9.0 novelty 8.0
    42557 ms 5602 in 1291 out 2026-05-15T01:50:50.981104+00:00