Pith Number

pith:RZ7367QX

pith:2025:RZ7367QXH25YUOKYP43PJPHK2O

not attested · not anchored · not stored · refs resolved

Evo-Memory: Benchmarking LLM Agent Test-time Learning with Self-Evolving Memory

Benjamin Coleman, Chi Wang, Derek Zhiyuan Cheng, Ed H. Chi, Fernando Pereira, Jingrui He, Mengting Ai, Noveen Sachdeva, Shuo Chen, Tianxin Wei, Wang-Cheng Kang, Xuying Ning, Yuanchen Bei, Yunzhe Li, Zhankui He

LLM agents achieve continual improvement on streaming tasks by using the ReMem pipeline to integrate reasoning, actions, and memory updates.

arxiv:2511.20857 v1 · 2025-11-25 · cs.CL · cs.AI

Add to your LaTeX paper

\usepackage{pith}
\pithnumber{RZ7367QXH25YUOKYP43PJPHK2O}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp

2 Internet Archive

3 Author claim open

4 Citations open

5 Replications open

✓ Portable graph bundle live

The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.
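The page does not spell out the merge algorithm, so the following is only a minimal sketch of one common way to make a merge deterministic: impose a total order on the events and fold them in that order, so any mirror reaches the same state regardless of arrival order. The event fields `ts`, `event_id`, and `patch`, the function `merge_events`, and the sample values are all hypothetical and are not taken from the Pith bundle format.

```python
def merge_events(events):
    """Fold events into one state using a total order (hypothetical fields)."""
    # Sorting by (timestamp, event id) removes any dependence on arrival order.
    ordered = sorted(events, key=lambda e: (e["ts"], e["event_id"]))
    state = {}
    for event in ordered:
        # Later events in the total order overwrite earlier values.
        state.update(event["patch"])
    return state


# Made-up sample events, for illustration only.
events = [
    {"event_id": "b", "ts": "2026-05-17T23:40:00Z", "patch": {"citations": "open"}},
    {"event_id": "a", "ts": "2026-05-17T23:39:19Z", "patch": {"refs": "resolved"}},
    {"event_id": "c", "ts": "2026-05-17T23:40:00Z", "patch": {"citations": "closed"}},
]

state_one = merge_events(events)
state_two = merge_events(list(reversed(events)))
print(state_one == state_two)
```

The tie-break on `event_id` matters here: two events share a timestamp, and without a secondary key their relative order would depend on how the input happened to be arranged.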

Claims

C1 strongest claim

ReMem, an action-think-memory refine pipeline, tightly integrates reasoning, task actions, and memory updates to achieve continual improvement in LLM agents on streaming tasks.

C2 weakest assumption

That the chosen sequential task streams and the implemented memory modules faithfully capture the dynamics of real-world continuous interactions where memory evolution is required, without hidden implementation biases affecting the comparisons.

C3 one-line summary

Evo-Memory is a new benchmark for self-evolving memory in LLM agents across task streams. It includes a baseline, ExpRAG, and a proposed method, ReMem, that integrates reasoning, actions, and memory updates for continual improvement.

References

299 extracted · 299 resolved · 36 Pith anchors

Formal links

1 machine-checked theorem link

Cited by

37 papers in Pith

Improve Large Language Model Systems with User Logs

MemCoT: Test-Time Scaling through Memory-Driven Chain-of-Thought

Auto-Dreamer: Learning Offline Memory Consolidation for Language Agents

A Comprehensive Survey on Agent Skills: Taxonomy, Techniques, and Applications

EvoMemBench: Benchmarking Agent Memory from a Self-Evolving Perspective

Receipt and verification

First computed	2026-05-17T23:39:19.896925Z
Builder	pith-number-builder-2026-05-17-v1
Signature	Pith Ed25519 (`pith-v1-2026-05`) · public key
Schema	pith-number/v1.0

Canonical hash

8e7fbf7e173ebb8a39587f36f4bcead394069ddcc3d7107926de73ed48948a0c

Aliases

arxiv: 2511.20857 · arxiv_version: 2511.20857v1 · doi: 10.48550/arxiv.2511.20857 · pith_short_12: RZ7367QXH25Y · pith_short_16: RZ7367QXH25YUOKY · pith_short_8: RZ7367QX
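The aliases above can be related to the canonical hash with a short check. On this record, the three `pith_short_*` aliases are prefixes of the full identifier, and the full identifier matches the first 26 characters of the RFC 4648 base32 encoding of the SHA-256 canonical hash. This is an observation about the values shown on this page, not a documented rule of the scheme; the helper name `base32_prefix` is made up for the sketch.

```python
import base64

# Values copied from this page.
CANONICAL_HASH = "8e7fbf7e173ebb8a39587f36f4bcead394069ddcc3d7107926de73ed48948a0c"
PITH_NUMBER = "RZ7367QXH25YUOKYP43PJPHK2O"


def base32_prefix(hex_digest: str, length: int) -> str:
    """Return the first `length` characters of the base32 form of a hex digest."""
    encoded = base64.b32encode(bytes.fromhex(hex_digest)).decode("ascii")
    return encoded[:length]


derived = base32_prefix(CANONICAL_HASH, len(PITH_NUMBER))
short_aliases = {n: PITH_NUMBER[:n] for n in (8, 12, 16)}

print(derived == PITH_NUMBER)  # True for the values on this page
print(short_aliases)
```

If the relationship holds in general, a short alias could be checked against a hash without contacting the resolver, but that should be confirmed against the schema before relying on it.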

Agent API

Resolver JSON · Graph JSON · Events JSON · Schema · Signing key

Verify this Pith Number yourself

curl -sH 'Accept: application/ld+json' https://pith.science/pith/RZ7367QXH25YUOKYP43PJPHK2O \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 8e7fbf7e173ebb8a39587f36f4bcead394069ddcc3d7107926de73ed48948a0c

Canonical record JSON

{
  "metadata": {
    "abstract_canon_sha256": "91510619ee77a1e8bb1dbe58ace5f7ed30ee2190b6f2fa018506d5ca6f3c0544",
    "cross_cats_sorted": [
      "cs.AI"
    ],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.CL",
    "submitted_at": "2025-11-25T21:08:07Z",
    "title_canon_sha256": "0d818bd916402e6653c573575779d522ab3b6ccd03e7edcf7f20c510c04d1e7e"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2511.20857",
    "kind": "arxiv",
    "version": 1
  }
}
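
The same check can be run offline against the record as reproduced on this page. The sketch below mirrors the serialization used in the verification command (sorted keys, compact separators, `ensure_ascii=False`, UTF-8 bytes) and compares the digest with the canonical hash listed above. The function name `canonical_sha256` is made up for the example, and a mismatch would only mean that the copy on this page differs from the record the resolver serves.

```python
import hashlib
import json

# Canonical hash as listed on this page.
EXPECTED = "8e7fbf7e173ebb8a39587f36f4bcead394069ddcc3d7107926de73ed48948a0c"

# Canonical record as reproduced on this page.
record = {
    "metadata": {
        "abstract_canon_sha256": "91510619ee77a1e8bb1dbe58ace5f7ed30ee2190b6f2fa018506d5ca6f3c0544",
        "cross_cats_sorted": ["cs.AI"],
        "license": "http://creativecommons.org/licenses/by/4.0/",
        "primary_cat": "cs.CL",
        "submitted_at": "2025-11-25T21:08:07Z",
        "title_canon_sha256": "0d818bd916402e6653c573575779d522ab3b6ccd03e7edcf7f20c510c04d1e7e",
    },
    "schema_version": "1.0",
    "source": {"id": "2511.20857", "kind": "arxiv", "version": 1},
}


def canonical_sha256(obj) -> str:
    """Hash the compact, key-sorted JSON form of obj, as the shell command does."""
    payload = json.dumps(
        obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


digest = canonical_sha256(record)
print(digest)
print("matches expected:", digest == EXPECTED)
```

Because keys are sorted before hashing, the digest does not depend on the order in which the fields appear in the source JSON; only the values and the key names matter.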