pith:YTJJXWC4
MemEvoBench: Benchmarking Safety Risks from Memory Misevolution in LLM Agents
Biased memory updates cause substantial safety degradation in LLM agents.
arxiv:2604.15774 v2 · 2026-04-17 · cs.CL
Add to your LaTeX paper
\usepackage{pith}
\pithnumber{YTJJXWC4HXVEG23P7X3BFYPILP}
Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.
Claims
Experiments on representative models reveal substantial safety degradation under biased memory updates. Our analysis suggests that memory evolution is a significant contributor to these failures. Furthermore, static prompt-based defenses prove insufficient.
The benchmark rests on the assumption that the constructed pools of mixed benign and misleading memories, used across multi-round interactions, accurately simulate real-world memory evolution and its safety impact on deployed LLM agents.
MemEvoBench is the first benchmark for long-horizon memory safety in LLM agents, using QA tasks across 7 domains and 36 risks plus workflow tasks with noisy tools to measure behavioral drift from biased memory updates.
Receipt and verification
| Field | Value |
|---|---|
| First computed | 2026-05-22T01:04:02.552031Z |
| Builder | pith-number-builder-2026-05-17-v1 |
| Signature | Pith Ed25519 (pith-v1-2026-05) · public key |
| Schema | pith-number/v1.0 |
Canonical hash
c4d29bd85c3dea436b6ffdf612e1e85bf51ab9afe7c39cc7e13ec914e2f4561f
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/YTJJXWC4HXVEG23P7X3BFYPILP \
| jq -c '.canonical_record' \
| python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: c4d29bd85c3dea436b6ffdf612e1e85bf51ab9afe7c39cc7e13ec914e2f4561f
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "ae9f0c8aee4c5a3d35f4f5fbc67d57b8c10684f2cc7ca1a8bb9be6c130951780",
    "cross_cats_sorted": [],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.CL",
    "submitted_at": "2026-04-17T07:29:52Z",
    "title_canon_sha256": "70f246af957e25df222467d9aa1bfeb9451a4ca0ee4974bcecddfbd4c7072abf"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2604.15774",
    "kind": "arxiv",
    "version": 2
  }
}
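The same check can be run offline, without `curl` or `jq`. The sketch below applies the canonicalization used in the verification command above (sorted keys, compact separators, `ensure_ascii=False`, UTF-8) to the record as printed on this page. The names `canonical_sha256`, `RECORD`, and `EXPECTED` are illustrative and not part of any Pith tooling. It also assumes the record shown here is identical to what the API serves under `.canonical_record`; if the two differ, the digests will differ as well.

```python
import hashlib
import json


def canonical_sha256(record: dict) -> str:
    """Serialize with sorted keys and compact separators, then SHA-256 the UTF-8 bytes."""
    canonical = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


# Canonical record copied from this page (assumed to match the served record).
RECORD = {
    "metadata": {
        "abstract_canon_sha256": "ae9f0c8aee4c5a3d35f4f5fbc67d57b8c10684f2cc7ca1a8bb9be6c130951780",
        "cross_cats_sorted": [],
        "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
        "primary_cat": "cs.CL",
        "submitted_at": "2026-04-17T07:29:52Z",
        "title_canon_sha256": "70f246af957e25df222467d9aa1bfeb9451a4ca0ee4974bcecddfbd4c7072abf",
    },
    "schema_version": "1.0",
    "source": {"id": "2604.15774", "kind": "arxiv", "version": 2},
}

# Canonical hash listed on this page.
EXPECTED = "c4d29bd85c3dea436b6ffdf612e1e85bf51ab9afe7c39cc7e13ec914e2f4561f"

digest = canonical_sha256(RECORD)
print(digest)
print("match" if digest == EXPECTED else "mismatch")
```

Because the keys are sorted before hashing, the order in which the fields are typed into `RECORD` does not affect the digest; only the key names and values do.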