Pith Number

pith:4GRNZWZP

pith:2026:4GRNZWZPMNCEH2VEYYDJZL7OUU
not attested · not anchored · not stored · refs resolved

Useful Memories Become Faulty When Continuously Updated by LLMs

Bingxuan Li, Dianqi Li, Dylan Zhang, Hao Peng, Yanshan Lin, Yihang Sun, Zhengkun Wu

Consolidated memories from LLMs degrade over repeated updates and can perform worse than using no memory at all.

arxiv:2605.12978 v1 · 2026-05-13 · cs.AI

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{4GRNZWZPMNCEH2VEYYDJZL7OUU}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle live · download bundle · merged state
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.
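
This page does not describe the merge algorithm itself. As a generic illustration only of what "deterministic merge" can mean, the sketch below deduplicates events and folds them in a fixed order, so that any mirror holding the same events computes the same state. The event fields (`id`, `ts`, `field`, `value`) and the function name `merge_events` are hypothetical and are not taken from the Pith schema.

```python
from typing import Dict, Iterable, List

Event = Dict[str, object]


def merge_events(*bundles: Iterable[Event]) -> Dict[str, object]:
    """Fold events from any number of bundles into one state dict.

    Assumption: each event has a unique, stable "id" (for example a
    content hash), so identical ids always carry identical payloads.
    """
    # Deduplicate by id so a mirror that sees an event twice is unaffected.
    unique: Dict[str, Event] = {}
    for bundle in bundles:
        for event in bundle:
            unique[str(event["id"])] = event

    # Sort by (timestamp, id): a total order that does not depend on
    # the order in which bundles or events arrived.
    ordered: List[Event] = sorted(
        unique.values(), key=lambda e: (e["ts"], str(e["id"]))
    )

    # Last writer (in that fixed order) wins per field.
    state: Dict[str, object] = {}
    for event in ordered:
        state[str(event["field"])] = event["value"]
    return state


mirror_a = [
    {"id": "e1", "ts": 1, "field": "citations", "value": "open"},
    {"id": "e2", "ts": 2, "field": "citations", "value": "closed"},
]
mirror_b = list(reversed(mirror_a))

state_ab = merge_events(mirror_a, mirror_b)
state_ba = merge_events(mirror_b, mirror_a)
```

Because the fold order depends only on the events themselves, `state_ab` and `state_ba` are equal regardless of which mirror's bundle is read first.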

Claims

C1 · strongest claim

Consolidated memories produced by today's LLMs are often faulty even when derived from useful experiences. As consolidation proceeds, memory utility first rises, then degrades, and can fall below the no-memory baseline. More surprisingly, even when consolidating from ground-truth solutions, GPT-5.4 fails on 54% of a set of ARC-AGI problems it had previously solved without memory.

C2 · weakest assumption

That the observed degradation is caused by the consolidation step itself rather than by limitations specific to the tested models, tasks, or update schedules, and that the ARC-AGI Stream environment sufficiently represents real-world agent memory use.

C3 · one line summary

LLM-consolidated agent memories degrade under continuous updates even when built from useful experiences: GPT-5.4 fails on 54% of a set of ARC-AGI problems it had previously solved without memory, while episodic retention preserves accuracy.

Receipt and verification
First computed: 2026-05-18T03:09:08.679222Z
Builder: pith-number-builder-2026-05-17-v1
Signature: Pith Ed25519 (pith-v1-2026-05)
Schema: pith-number/v1.0

Canonical hash

e1a2dcdb2f634443eaa4c6069cafeea53f8cc8a6f9e73c1fcbd16f4da22d3000

Aliases

arxiv: 2605.12978 · arxiv_version: 2605.12978v1 · doi: 10.48550/arxiv.2605.12978 · pith_short_12: 4GRNZWZPMNCE · pith_short_16: 4GRNZWZPMNCEH2VE · pith_short_8: 4GRNZWZP
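
For the values shown on this page, the full Pith Number matches the unpadded Base32 encoding of the first 16 bytes of the canonical hash, and each `pith_short_N` alias is its first N characters. This is an observation from this one record, not a documented rule; the helper name `pith_from_hash` and the 16-byte truncation are assumptions inferred from the values above.

```python
import base64


def pith_from_hash(hex_digest: str, n_bytes: int = 16) -> str:
    """Base32-encode the first n_bytes of a hex digest, without '=' padding.

    Assumption: this mirrors how the identifier on this page relates to
    its canonical hash; the real builder may differ for other records.
    """
    raw = bytes.fromhex(hex_digest)[:n_bytes]
    return base64.b32encode(raw).decode("ascii").rstrip("=")


canonical_hash = "e1a2dcdb2f634443eaa4c6069cafeea53f8cc8a6f9e73c1fcbd16f4da22d3000"
pith = pith_from_hash(canonical_hash)

# Short aliases are plain prefixes of the full identifier.
aliases = {f"pith_short_{n}": pith[:n] for n in (8, 12, 16)}

print(pith)     # 4GRNZWZPMNCEH2VEYYDJZL7OUU
print(aliases)
```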
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/4GRNZWZPMNCEH2VEYYDJZL7OUU \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: e1a2dcdb2f634443eaa4c6069cafeea53f8cc8a6f9e73c1fcbd16f4da22d3000
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "2f7370ef81a575c6e0e9d975fcbb16718b09c785774323377c7f21157b6ab71a",
    "cross_cats_sorted": [],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.AI",
    "submitted_at": "2026-05-13T04:15:50Z",
    "title_canon_sha256": "8d110404a9f0bd78130fa0f993d6c99bcd8b4b87b34ffa6f80d90d055fc4044e"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.12978",
    "kind": "arxiv",
    "version": 1
  }
}
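
The shell one-liner under "Verify this Pith Number yourself" can also be written as a small Python function. This is a sketch that mirrors the same serialization settings (sorted keys, compact separators, `ensure_ascii=False`); the function name `canonical_sha256` is hypothetical. Compare the digest it prints against the canonical hash listed on this page rather than relying on this example.

```python
import hashlib
import json


def canonical_sha256(record: dict) -> str:
    """Hash a record using the same JSON canonicalization as the shell check."""
    canon = json.dumps(
        record,
        sort_keys=True,            # key order must not affect the hash
        separators=(",", ":"),     # no whitespace between tokens
        ensure_ascii=False,        # keep non-ASCII characters as-is
    ).encode("utf-8")
    return hashlib.sha256(canon).hexdigest()


# The canonical record as shown on this page.
record = {
    "metadata": {
        "abstract_canon_sha256": "2f7370ef81a575c6e0e9d975fcbb16718b09c785774323377c7f21157b6ab71a",
        "cross_cats_sorted": [],
        "license": "http://creativecommons.org/licenses/by/4.0/",
        "primary_cat": "cs.AI",
        "submitted_at": "2026-05-13T04:15:50Z",
        "title_canon_sha256": "8d110404a9f0bd78130fa0f993d6c99bcd8b4b87b34ffa6f80d90d055fc4044e",
    },
    "schema_version": "1.0",
    "source": {"id": "2605.12978", "kind": "arxiv", "version": 1},
}

print(canonical_sha256(record))
```

Sorting keys before hashing means a mirror that stores the same record with a different key order still computes the same digest.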