arxiv: 2605.14498 · 2 revisions
GroupMemBench: Benchmarking LLM Agent Memory in Multi-Party Conversations