arxiv: 2604.15774 · 2 revisions
MemEvoBench: Benchmarking Safety Risks from Memory Misevolution in LLM Agents