Rethinking Memory as Continuously Evolving Connectivity

arXiv: 2605.28773 · v1 · pith:ZGSJA7BHnew · submitted 2026-05-27 · cs.CL · cs.AI · cs.LG · cs.MA · cs.MM

Jizhan Fang, Buqiang Xu, Zhixian Wang, Haoliang Cao, Xinle Deng, Baohua Dong, Hangcheng Zhu, Ruohui Huang, Gang Yu, Ying Wei, Guozhou Zheng, Feiyu Xiong, Haofen Wang, Huajun Chen, Ningyu Zhang

Pith reviewed 2026-06-29 12:23 UTC · model grok-4.3

classification: cs.CL · cs.AI · cs.LG · cs.MA · cs.MM

keywords: LLM agents · memory augmentation · heterogeneous graphs · evolving connectivity · agentic environments · dynamic memory · topology refinement

The pith

Memory in LLM agents works better when modeled as a connectivity-evolving heterogeneous graph refined across three stages instead of a static repository.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper claims that fixed memory representations and retrieval methods break down in dynamic agent environments because feedback and task changes continuously alter what should be remembered and how elements connect. FluxMem counters this by representing memory as a heterogeneous graph and evolving its topology in three stages: forming initial connections, refining them based on feedback, and consolidating over the long term. A single metric tracks generalizability and evolutionary maturity while the system repairs missing links, removes interfering ones, matches abstraction levels, and turns repeated successes into reusable circuits. Results on LoCoMo, Mind2Web, and GAIA show consistent gains, indicating that treating memory as evolving connectivity supports stronger adaptation without relying on preset structures.

Core claim

FluxMem models memory as a heterogeneous graph and progressively refines its topology through initial connection formation, feedback-driven refinement, and long-term consolidation. It repairs missing links, prunes interference, aligns abstraction granularity, and distills recurrent successful trajectories into reusable procedural circuits, guided by one metric for memory generalizability and evolutionary maturity, which produces state-of-the-art performance on LoCoMo, Mind2Web, and GAIA.
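One way to picture the three stages is the toy sketch below. It is an illustration under stated assumptions, not the paper's implementation: the `GraphMemory` class, the node kinds, the edge-weight update, the learning rate, and the pruning threshold are all hypothetical, and the paper's guiding metric is not reproduced because the abstract does not define it.

```python
from collections import defaultdict


class GraphMemory:
    """Toy heterogeneous graph memory. All names and rules are hypothetical."""

    def __init__(self, prune_below=0.2):
        self.kind = {}                    # node id -> node type, e.g. "query", "fact"
        self.weight = defaultdict(float)  # (src, dst) -> edge strength in [0, 1]
        self.prune_below = prune_below

    def add_node(self, node, kind):
        self.kind[node] = kind

    # Stage I: initial connection formation.
    def connect(self, src, dst, weight=0.5):
        self.weight[(src, dst)] = weight

    # Stage II: feedback-driven refinement of the edges used in one step.
    def refine(self, used_edges, success, lr=0.3):
        target = 1.0 if success else 0.0
        for edge in used_edges:
            if edge not in self.weight:
                if not success:
                    continue
                self.weight[edge] = 0.5  # repair a missing link that proved useful
            w = self.weight[edge]
            self.weight[edge] = w + lr * (target - w)

    # Stage III: offline consolidation that prunes weak (interfering) edges.
    def consolidate(self):
        weak = [e for e, w in self.weight.items() if w < self.prune_below]
        for e in weak:
            del self.weight[e]
        return weak

    def neighbors(self, src):
        return sorted(d for (s, d) in self.weight if s == src)


mem = GraphMemory()
mem.add_node("q1", "query")
for fact in ("fact_a", "fact_b", "fact_c"):
    mem.add_node(fact, "fact")
mem.connect("q1", "fact_a")
mem.connect("q1", "fact_b")

mem.refine([("q1", "fact_a")], success=True)       # reinforced
for _ in range(3):
    mem.refine([("q1", "fact_b")], success=False)  # decays toward 0
mem.refine([("q1", "fact_c")], success=True)       # missing link is repaired

pruned = mem.consolidate()
print(pruned, mem.neighbors("q1"))
```

In this sketch the link to `fact_b` decays below the threshold and is pruned during consolidation, while the missing link to `fact_c` is created after a successful step. A real system would need typed edges and a principled score in place of the fixed threshold.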

What carries the argument

The heterogeneous graph memory representation together with its three-stage progressive topology refinement process, guided by a single metric of generalizability and evolutionary maturity.

If this is right

Agents gain the ability to dynamically repair and prune memory connections in response to ongoing feedback and task variation.
Successful trajectories become reusable procedural circuits that reduce repeated computation in similar future tasks.
A single guiding metric for generalizability and maturity simplifies oversight of the memory evolution process.
Performance remains high across fundamentally different benchmarks that test complex agentic behavior.
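The idea of turning repeated successes into reusable circuits can be sketched with a simple frequency count. This is a hedged illustration only: the function `distill_circuits`, the n-gram representation of a circuit, the `min_support` parameter, and the example action names are assumptions made here, not details from the paper.

```python
from collections import Counter


def distill_circuits(trajectories, length=2, min_support=2):
    """Return action n-grams that recur in at least `min_support` successful runs."""
    support = Counter()
    for actions, success in trajectories:
        if not success:
            continue  # only successful trajectories contribute
        # Count each n-gram at most once per trajectory.
        grams = {
            tuple(actions[i:i + length])
            for i in range(len(actions) - length + 1)
        }
        support.update(grams)
    return sorted(g for g, n in support.items() if n >= min_support)


# Hypothetical trajectories: (action sequence, success flag).
runs = [
    (["search", "open", "extract", "answer"], True),
    (["search", "open", "scroll", "extract"], True),
    (["search", "click_ad", "answer"], False),
]
circuits = distill_circuits(runs)
print(circuits)
```

Here only the pair `("search", "open")` appears in two successful runs, so it is the only candidate circuit. Counting once per trajectory keeps a single long run from dominating the support count.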

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same graph-evolution approach could extend to non-agent LLM uses such as maintaining coherent long-context reasoning without fixed retrieval rules.
If the three-stage process scales without added cost, it offers a path to reduce reliance on periodic full retraining when environments shift.
Treating memory links as the primary object of change rather than stored content alone might generalize to other sequential decision systems.

Load-bearing premise

Progressively refining the topology of a heterogeneous graph memory through three stages guided by one metric will deliver reliable adaptation gains without instability or excessive cost.

What would settle it

If FluxMem fails to reach state-of-the-art results or shows performance instability on any of the LoCoMo, Mind2Web, or GAIA benchmarks under the described conditions, the central claim would not hold.

Figures

Figures reproduced from arXiv: 2605.28773 by Baohua Dong, Buqiang Xu, Feiyu Xiong, Gang Yu, Guozhou Zheng, Hangcheng Zhu, Haofen Wang, Haoliang Cao, Huajun Chen, Jizhan Fang, Ningyu Zhang, Ruohui Huang, Xinle Deng, Ying Wei, Zhixian Wang.

**Figure 1.** The failures of static memory systems. … agents, memory effectiveness ultimately depends on whether the most useful memories can be accessed at each decision step, as sufficiently useful memory context substantially improves subtask success. We formalize such usefulness as a problem of memory connectivity. Drawing from cognitive science (Hebb, 2005; Frankland and Bontempi, 2005), we define memory as the lon…

**Figure 2.** The FluxMem architecture. Stages I and II operate online at a step-wise granularity. Stage III is conducted offline, aiming for immediate performance optimization and long-term memory consolidation, respectively. … its full step-by-step trajectory $\tau_q = \{(o_t, a_t)\}_{t=1}^{T}$. The three layers are linked in a bottom-up order through two types of edges in $E$. First, during task execution, the agent retrieves relevan…

**Figure 3.** Detailed analysis of FluxMem components and evolution dynamics: (a) Ablation study of different stages

**Figure 4.** Case Study. The key points have been highlighted in red.

Original abstract

Existing memory-augmented LLM agents often treat memory as a static repository with pre-defined representations and fixed retrieval pipelines, which is brittle in dynamic agentic environments where feedback, task variation, and heterogeneous signals continuously reshape what should be remembered and how it should be connected. To address this, we propose FluxMem, a connectivity-evolving memory framework that models memory as a heterogeneous graph and progressively refines its topology through three stages: initial connection formation, feedback-driven refinement, and long-term consolidation. During execution, FluxMem repairs missing links, prunes interference, aligns abstraction granularity, and distills recurrent successful trajectories into reusable procedural circuits, guided by one metric for memory generalizability and evolutionary maturity. Across three fundamentally distinct benchmarks including LoCoMo, Mind2Web, and GAIA, FluxMem achieves consistent state-of-the-art performance, demonstrating strong adaptation and generalization in complex agentic environments. The code will be open-sourced in https://github.com/zjunlp/LightMem.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

FluxMem's three-stage graph evolution for agent memory is a clear step from static setups, but the SOTA claims on three benchmarks rest on an undefined metric with no visible details or checks.

FluxMem proposes modeling memory as a heterogeneous graph that refines itself through initial connection formation, feedback-driven refinement, and long-term consolidation. The process includes repairing links, pruning interference, aligning granularity, and distilling trajectories into circuits, all guided by one metric for generalizability and maturity.

The new part is the explicit three-stage connectivity evolution aimed at dynamic agent environments. This directly targets the brittleness of fixed repositories and retrieval pipelines when feedback and task variation occur, which the abstract identifies as a practical issue.

The paper does a reasonable job framing why static memory falls short in settings like those tested. The contrast with prior approaches is straightforward and relevant for agent work.

The soft spots center on evaluation. The abstract gives no formula, definition, or sensitivity analysis for the guiding metric. It also reports no stage ablations, baseline comparisons, numbers, or implementation specifics, even though it claims consistent SOTA across LoCoMo, Mind2Web, and GAIA. The caption of Figure 3 mentions a stage ablation, but its results are not visible here. Without those details, it is not possible to confirm whether the refinement mechanism produces the gains or whether pruning introduces instability or excess cost. The stress-test concern about the metric and the missing stability checks matches what is visible.

If the full paper supplies the metric computation, experimental breakdowns, and the promised open-source code with reproducible results, the framework becomes easier to assess. As it stands, the central claims are difficult to verify.

This is for researchers building or studying memory in LLM agents. Readers focused on dynamic adaptation in agentic systems would get the most from the graph refinement concept.

I would send it to peer review so the methods and data can be examined directly.

Referee Report

3 major / 0 minor

Summary. The manuscript proposes FluxMem, a memory-augmented LLM agent framework that models memory as a heterogeneous graph whose topology evolves continuously via three stages—initial connection formation, feedback-driven refinement, and long-term consolidation—while repairing links, pruning interference, aligning abstraction levels, and distilling trajectories. A single (unspecified) metric for generalizability and evolutionary maturity is said to guide all operations. The central empirical claim is consistent state-of-the-art performance across three distinct benchmarks (LoCoMo, Mind2Web, GAIA) demonstrating superior adaptation and generalization in dynamic agentic settings.

Significance. If the three-stage refinement mechanism and its guiding metric can be shown to produce stable gains without ad-hoc fitting or excessive cost, the work would offer a substantive alternative to static memory repositories in agent literature. The planned open-sourcing of code is a positive step toward reproducibility.

major comments (3)

[Abstract] The claim that a single metric for 'memory generalizability and evolutionary maturity' reliably controls link repair, pruning, alignment, and distillation across three stages is load-bearing for the SOTA results, yet no definition, formula, or sensitivity analysis of this metric is supplied; without it the reported performance cannot be traced to the proposed mechanism.
[Abstract] No ablation isolating the contribution of each of the three stages (initial formation, feedback-driven refinement, long-term consolidation) or quantifying instability introduced by pruning is presented, leaving open the possibility that observed gains arise from other unstated factors rather than the evolving-connectivity design.
[Abstract] The benchmarks (LoCoMo, Mind2Web, GAIA) are described as 'fundamentally distinct,' but no baseline comparisons, metric definitions, or statistical significance tests are referenced, so it is impossible to verify that the 'consistent state-of-the-art' claim follows from the graph-evolution procedure.
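The significance check requested in the last comment could take several forms. A minimal sketch of one common option, a paired bootstrap over per-task scores, follows. The function name `paired_bootstrap`, its parameters, and the placeholder score lists are assumptions for illustration; none of them come from the paper or from this report.

```python
import random


def paired_bootstrap(scores_a, scores_b, n_resamples=2000, seed=0):
    """Fraction of resamples in which system A's total does not exceed system B's.

    Both lists hold per-task scores for the same tasks in the same order.
    The result is a rough one-sided p-value for "A is better than B".
    """
    assert len(scores_a) == len(scores_b)
    rng = random.Random(seed)
    n = len(scores_a)
    not_better = 0
    for _ in range(n_resamples):
        # Resample task indices with replacement, keeping the pairing intact.
        idx = [rng.randrange(n) for _ in range(n)]
        diff = sum(scores_a[i] - scores_b[i] for i in idx)
        if diff <= 0:
            not_better += 1
    return not_better / n_resamples


# Placeholder per-task scores (1 = solved, 0 = failed); not real benchmark data.
system_a = [1, 1, 1, 1, 1, 1]
system_b = [0, 0, 0, 0, 0, 0]
p_clear = paired_bootstrap(system_a, system_b)  # A wins on every task
p_tied = paired_bootstrap(system_a, system_a)   # identical systems
print(p_clear, p_tied)
```

Resampling task indices rather than raw scores preserves the pairing between systems, which matters when task difficulty varies widely within a benchmark.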

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive comments. We address each major point below and will revise the manuscript to supply the requested details and analyses.

Referee: [Abstract] The claim that a single metric for 'memory generalizability and evolutionary maturity' reliably controls link repair, pruning, alignment, and distillation across three stages is load-bearing for the SOTA results, yet no definition, formula, or sensitivity analysis of this metric is supplied; without it the reported performance cannot be traced to the proposed mechanism.

Authors: We agree that an explicit definition, formula, and sensitivity analysis of the guiding metric are required to trace performance to the mechanism. We will add these elements, including the precise formulation and sensitivity results, to the methods and experimental sections of the revised manuscript. Revision: yes.

Referee: [Abstract] No ablation isolating the contribution of each of the three stages (initial formation, feedback-driven refinement, long-term consolidation) or quantifying instability introduced by pruning is presented, leaving open the possibility that observed gains arise from other unstated factors rather than the evolving-connectivity design.

Authors: We concur that stage-specific ablations and pruning instability analysis are necessary. We will incorporate these ablations and the associated instability quantification into the experiments section of the revised manuscript. Revision: yes.

Referee: [Abstract] The benchmarks (LoCoMo, Mind2Web, GAIA) are described as 'fundamentally distinct,' but no baseline comparisons, metric definitions, or statistical significance tests are referenced, so it is impossible to verify that the 'consistent state-of-the-art' claim follows from the graph-evolution procedure.

Authors: We will expand the abstract and results discussion to explicitly reference the baseline comparisons, metric definitions, and statistical significance tests already computed on these benchmarks, thereby clarifying how the gains derive from the graph-evolution procedure. Revision: yes.

Circularity Check

0 steps flagged

No significant circularity: empirical framework with no equations or derivations

The paper presents FluxMem as a three-stage heterogeneous graph refinement process guided by an unspecified metric for generalizability, with performance claims on external benchmarks (LoCoMo, Mind2Web, GAIA). No equations, formal derivations, fitted parameters renamed as predictions, or load-bearing self-citations appear in the abstract or description. The central claims are empirical and do not reduce by construction to inputs via self-definition or ansatz smuggling; they remain open to external validation or falsification.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract supplies no explicit free parameters, axioms, or invented entities.

pith-pipeline@v0.9.1-grok · 5756 in / 1006 out tokens · 32898 ms · 2026-06-29T12:23:17.727115+00:00 · methodology

Reference graph

Works this paper leans on

69 extracted references · 55 canonical work pages · 30 internal anchors

[3] Aadharsh Aadhithya A, Sachin Kumar S, and Soman K. P. 2024. Enhancing long-term memory using hierarchical aggregate tree for retrieval augmented generation. Preprint, arXiv:2406.06124. https://arxiv.org/abs/2406.06124
[4] Huan-ang Gao, Jiayi Geng, Wenyue Hua, Mengkang Hu, Xinzhe Juan, Hongzhang Liu, Shilong Liu, Jiahao Qiu, Xuan Qi, Yiran Wu, Hongru Wang, Han Xiao, Yuhang Zhou, Shaokun Zhang, Jiayi Zhang, Jinyu Xiang, Yixiong Fang, Qiwen Zhao, Dongrui Liu, and 8 others. 2026. A survey of self-evolving agents: What, when, how, and where to e... https://arxiv.org/abs/2507.21046
[5] Ali Behrouz, Meisam Razaviyayn, Peilin Zhong, and Vahab Mirrokni. 2025a. It's all connected: A journey through test-time memorization, attentional bias, retention, and online optimization. Preprint, arXiv:2504.13173. https://arxiv.org/abs/2504.13173
[6] Ali Behrouz, Meisam Razaviyayn, Peilin Zhong, and Vahab Mirrokni. 2025b. Nested learning: The illusion of deep learning architectures. Preprint, arXiv:2512.24695. https://arxiv.org/abs/2512.24695
[7] Ali Behrouz, Peilin Zhong, and Vahab Mirrokni. 2024. Titans: Learning to memorize at test time. Preprint, arXiv:2501.00663. https://arxiv.org/abs/2501.00663
[8] Zhicheng Cai, Xinyuan Guo, Yu Pei, Jiangtao Feng, Jinsong Su, Jiangjie Chen, Ya-Qin Zhang, Wei-Ying Ma, Mingxuan Wang, and Hao Zhou. 2025. Flex: Continuous agent evolution via forward learning from experience. Preprint, arXiv:2511.06449. https://arxiv.org/abs/2511.06449
[9] Zouying Cao, Jiaji Deng, Li Yu, Weikang Zhou, Zhaoyang Liu, Bolin Ding, and Hai Zhao. 2025. Remember me, refine me: A dynamic procedural memory framework for experience-driven agent evolution. Preprint, arXiv:2512.10696. https://arxiv.org/abs/2512.10696
[10] Ding Chen, Simin Niu, Kehang Li, Peng Liu, Xiangping Zheng, Bo Tang, Xinchi Li, Feiyu Xiong, and Zhiyu Li. 2026a. Halumem: Evaluating hallucinations in memory systems of agents. Preprint, arXiv:2511.03506. https://arxiv.org/abs/2511.03506
[11] Yining Chen, Jihao Zhao, Bo Tang, Haofen Wang, Yue Zhang, Fei Huang, Feiyu Xiong, and Zhiyu Li. 2026b. Memprivacy: Privacy-preserving personalized memory management for edge-cloud agents. Preprint, arXiv:2605.09530. https://arxiv.org/abs/2605.09530
[12] Prateek Chhikara, Dev Khant, Saket Aryan, Taranjeet Singh, and Deshraj Yadav. 2025. Mem0: Building production-ready ai agents with scalable long-term memory. Preprint, arXiv:2504.19413. https://arxiv.org/abs/2504.19413
[13] Xiang Deng, Yu Gu, Boyuan Zheng, Shijie Chen, Sam Stevens, Boshi Wang, Huan Sun, and Yu Su. 2023. Mind2web: Towards a generalist agent for the web. Advances in Neural Information Processing Systems, 36:28091–28114.
[14] Jinyuan Fang, Yanwen Peng, Xi Zhang, Yingxu Wang, Xinhao Yi, Guibin Zhang, Yi Xu, Bin Wu, Siwei Liu, Zihao Li, Zhaochun Ren, Nikos Aletras, Xi Wang, Han Zhou, and Zaiqiao Meng. 2025a. A comprehensive survey of self-evolving ai agents: A new paradigm bridging foundation models and lifelong agentic systems. Preprint, arXi... https://arxiv.org/abs/2508.07407
[15] Jizhan Fang, Xinle Deng, Haoming Xu, Ziyan Jiang, Yuqi Tang, Ziwen Xu, Shumin Deng, Yunzhi Yao, Mengru Wang, Shuofei Qiao, and 1 other. 2025b. Lightmem: Lightweight and efficient memory-augmented generation. arXiv preprint arXiv:2510.18866.
[16] Runnan Fang, Yuan Liang, Xiaobin Wang, Jialong Wu, Shuofei Qiao, Pengjun Xie, Fei Huang, Huajun Chen, and Ningyu Zhang. 2026. Memp: Exploring agent procedural memory. Preprint, arXiv:2508.06433. https://arxiv.org/abs/2508.06433
[17] Adam Fourney, Gagan Bansal, Hussein Mozannar, Cheng Tan, Eduardo Salinas, Erkang Zhu, Friederike Niedtner, Grace Proebsting, Griffin Bassman, Jack Gerrits, Jacob Alber, Peter Chang, Ricky Loynd, Robert West, Victor Dibia, Ahmed Awadallah, Ece Kamar, Rafah Hosn, and Saleema Amershi. 2024. Magentic-one: A generalist multi-a... https://arxiv.org/abs/2411.04468
[18] Paul W Frankland and Bruno Bontempi. 2005. The organization of recent and remote memories. Nature Reviews Neuroscience, 6(2):119–130.
[19] Bernal Jiménez Gutiérrez, Yiheng Shu, Yu Gu, Michihiro Yasunaga, and Yu Su. 2025a. Hipporag: Neurobiologically inspired long-term memory for large language models. Preprint, arXiv:2405.14831. https://arxiv.org/abs/2405.14831
[20] Bernal Jiménez Gutiérrez, Yiheng Shu, Weijian Qi, Sizhe Zhou, and Yu Su. 2025b. From rag to memory: Non-parametric continual learning for large language models. Preprint, arXiv:2502.14802. https://arxiv.org/abs/2502.14802
[21] Haoyu Han, Yu Wang, Harry Shomer, Kai Guo, Jiayuan Ding, Yongjia Lei, Mahantesh Halappanavar, Ryan A. Rossi, Subhabrata Mukherjee, Xianfeng Tang, Qi He, Zhigang Hua, Bo Long, Tong Zhao, Neil Shah, Amin Javari, Yinglong Xia, and Jiliang Tang. 2025. Retrieval-augmented generation with graphs (graphrag). Preprint, arXiv:2501.00309. https://arxiv.org/abs/2501.00309
[22] Donald Olding Hebb. 2005. The organization of behavior: A neuropsychological theory. Psychology Press.
[23] Chuanrui Hu, Xingze Gao, Zuyi Zhou, Dannong Xu, Yi Bai, Xintong Li, Hui Zhang, Tong Li, Chong Zhang, Lidong Bing, and 1 other. 2026a. Evermemos: A self-organizing memory operating system for structured long-horizon reasoning. arXiv preprint arXiv:2601.02163.
[24] Yuyang Hu, Shichun Liu, Yanwei Yue, Guibin Zhang, Boyang Liu, Fangyi Zhu, Jiahang Lin, Honglin Guo, Shihan Dou, Zhiheng Xi, Senjie Jin, Jiejun Tan, Yanbin Yin, Jiongnan Liu, Zeyu Zhang, Zhongxiang Sun, Yutao Zhu, Hao Sun, Boci Peng, and 28 others. 2026b. Memory in the age of ai agents. Preprint, arXiv:2512.13564. https://arxiv.org/abs/2512.13564
[25] Bowen Jiang, Yuan Yuan, Maohao Shen, Zhuoqun Hao, Zhangchen Xu, Zichen Chen, Ziyi Liu, Anvesh Rao Vijjini, Jiashu He, Hanchao Yu, Radha Poovendran, Gregory Wornell, Lyle Ungar, Dan Roth, Sihao Chen, and Camillo Jose Taylor. 2025. Personamem-v2: Towards personalized intelligence via learning implicit user personas and agent... https://arxiv.org/abs/2512.06688
[26] Jiazheng Kang, Mingming Ji, Zhe Zhao, and Ting Bai. 2025. Memory os of ai agent. arXiv preprint arXiv:2506.06326.
[27] AM Clare Kelly and Hugh Garavan. 2005. Human functional neuroimaging of brain changes associated with practice. Cerebral Cortex, 15(8):1089–1102.
[28] Yitao Liu, Chenglei Si, Karthik Narasimhan, and Shunyu Yao. 2025. Contextual experience replay for self-improvement of language agents. Preprint, arXiv:2506.06698. https://arxiv.org/abs/2506.06698
[29] Lin Long, Yichen He, Wentao Ye, Yiyuan Pan, Yuan Lin, Hang Li, Junbo Zhao, and Wei Li. 2025. Seeing, listening, remembering, and reasoning: A multimodal agent with long-term memory. Preprint, arXiv:2508.09736. https://arxiv.org/abs/2508.09736
[30] Adyasha Maharana, Dong-Ho Lee, Sergey Tulyakov, Mohit Bansal, Francesco Barbieri, and Yuwei Fang. 2024. Evaluating very long-term conversational memory of llm agents. arXiv preprint arXiv:2402.17753.
[31] Lingrui Mei, Jiayu Yao, Yuyao Ge, Yiwei Wang, Baolong Bi, Yujun Cai, Jiazhi Liu, Mingyu Li, Zhong-Zhi Li, Duzhen Zhang, Chenlin Zhou, Jiayi Mao, Tianze Xia, Jiafeng Guo, and Shenghua Liu. 2025. A survey of context engineering for large language models. Preprint, arXiv:2507.13334. https://arxiv.org/abs/2507.13334
[32] Grégoire Mialon, Clémentine Fourrier, Thomas Wolf, Yann LeCun, and Thomas Scialom. 2023. Gaia: a benchmark for general ai assistants. In The Twelfth International Conference on Learning Representations.
[33] Jiayan Nan, Wenquan Ma, Wenlong Wu, and Yize Chen. 2025. Nemori: Self-organizing agent memory inspired by cognitive science. arXiv preprint arXiv:2508.03341.
[34] OpenAI. 2024. deepresearch. https://openai.com/index/introducing-deep-research/
[35] Siru Ouyang, Jun Yan, I-Hung Hsu, Yanfei Chen, Ke Jiang, Zifeng Wang, Rujun Han, Long T. Le, Samira Daruki, Xiangru Tang, Vishy Tirumalashetty, George Lee, Mahsan Rofouei, Hangfei Lin, Jiawei Han, Chen-Yu Lee, and Tomas Pfister. 2025. Reasoningbank: Scaling agent self-evolving with reasoning memory. Preprint, arXiv:2509.25140. https://arxiv.org/abs/2509.25140
[36] Charles Packer, Sarah Wooders, Kevin Lin, Vivian Fang, Shishir G. Patil, Ion Stoica, and Joseph E. Gonzalez. 2024. Memgpt: Towards llms as operating systems. Preprint, arXiv:2310.08560. https://arxiv.org/abs/2310.08560
[37] Daiyi Peng. 2023. Langfun. https://github.com/google/langfun
[38] Shihao Qi, Jie Ma, Rui Xing, Wei Guo, Xiao Huang, Zhitao Gao, Jianhao Deng, Jun Liu, Lingling Zhang, Bifan Wei, Boqian Yang, Pinghui Wang, Jianwen Sun, Jing Tao, Yaqiang Wu, Hui Liu, Yu Yao, and Tongliang Liu. 2026. Beyond individual intelligence: Surveying collaboration, failure attribution, and self-evolution in llm-base... https://arxiv.org/abs/2605.14892
[39] Tianrui Qin, Qianben Chen, Sinuo Wang, He Xing, King Zhu, He Zhu, Dingfeng Shi, Xinxin Liu, Ge Zhang, Jiaheng Liu, Yuchen Eleanor Jiang, Xitong Gao, and Wangchunshu Zhou. 2025. Flash-searcher: Fast and effective web agents via dag-based parallel execution. Preprint, arXiv:2509.25301. https://arxiv.org/abs/2509.25301
[40] Jiahao Qiu, Xuan Qi, Tongcheng Zhang, Xinzhe Juan, Jiacheng Guo, Yifu Lu, Yimin Wang, Zixin Yao, Qihan Ren, Xun Jiang, Xing Zhou, Dongrui Liu, Ling Yang, Yue Wu, Kaixuan Huang, Shilong Liu, Hongru Wang, and Mengdi Wang. 2025. Alita: Generalist agent enabling scalable agentic reasoning with minimal predefinition and maximal... https://arxiv.org/abs/2505.20286
[41] Preston Rasmussen, Pavlo Paliychuk, Travis Beauvais, Jack Ryan, and Daniel Chalef. 2025. Zep: A temporal knowledge graph architecture for agent memory. Preprint, arXiv:2501.13956. https://arxiv.org/abs/2501.13956
[42] Aymeric Roucher, Albert Villanova del Moral, Thomas Wolf, Leandro von Werra, and Erik Kaunismäki. 2025. `smolagents`: a smol library to build great agentic systems. https://github.com/huggingface/smolagents
[43] Yuchen Shi, Yuzheng Cai, Siqi Cai, Zihan Xu, Lichao Chen, Yulei Qin, Zhijian Zhou, Xiang Fei, Chaofan Qiu, Xiaoyu Tan, Gang Li, Zongyi Li, Haojia Lin, Guocan Cai, Yong Mao, Yunsheng Wu, Ke Li, and Xing Sun. 2025. Youtu-agent: Scaling agent productivity with automated generation and hybrid policy optimization. Preprint, ar... https://arxiv.org/abs/2512.24615
[44] Noah Shinn, Federico Cassano, Ashwin Gopinath, Karthik Narasimhan, and Shunyu Yao. 2023. Reflexion: Language agents with verbal reinforcement learning. Advances in Neural Information Processing Systems, 36:8634–8652.
[45] Mirac Suzgun, Mert Yuksekgonul, Federico Bianchi, Dan Jurafsky, and James Zou. 2026. Dynamic cheatsheet: Test-time learning with adaptive memory. In Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers), pages 7080–7106, Raba... https://doi.org/10.18653/v1/2026.eacl-long.333
[46] Xiangru Tang, Tianyu Hu, Muyang Ye, Yanjun Shao, Xunjian Yin, Siru Ouyang, Wangchunshu Zhou, Pan Lu, Zhuosheng Zhang, Yilun Zhao, Arman Cohan, and Mark Gerstein. 2025a. Chemagent: Self-updating library in large language models improves chemical reasoning. Preprint, arXiv:2501.06590. https://arxiv.org/abs/2501.06590
[47] Xiangru Tang, Tianrui Qin, Tianhao Peng, Ziyang Zhou, Daniel Shao, Tingting Du, Xinming Wei, Peng Xia, Fang Wu, He Zhu, Ge Zhang, Jiaheng Liu, Xingyao Wang, Sirui Hong, Chenglin Wu, Hao Cheng, Chi Wang, and Wangchunshu Zhou. 2025b. Agent kb: Leveraging cross-domain experience for agentic problem solving. Preprint, arXiv... https://arxiv.org/abs/2507.06229
[48] Chenxi Wang, Zhuoyun Yu, Xin Xie, Wuguannan Yao, Runnan Fang, Shuofei Qiao, Kexin Cao, Guozhou Zheng, Xiang Qi, Peng Zhang, and Shumin Deng. 2026. Skillx: Automatically constructing skill knowledge bases for agents. Preprint, arXiv:2604.04804. https://arxiv.org/abs/2604.04804
[49] Peng Wang, Zexi Li, Ningyu Zhang, Ziwen Xu, Yunzhi Yao, Yong Jiang, Pengjun Xie, Fei Huang, and Huajun Chen. 2024a. Wise: Rethinking the knowledge memory for lifelong model editing of large language models. Advances in Neural Information Processing Systems, 37:53764–53797.
[50] Yu Wang and Xi Chen. 2025. Mirix: Multi-agent memory system for llm-based agents. arXiv preprint arXiv:2507.07957.
[51] Yu Wang, Yifan Gao, Xiusi Chen, Haoming Jiang, Shiyang Li, Jingfeng Yang, Qingyu Yin, Zheng Li, Xian Li, Bing Yin, Jingbo Shang, and Julian McAuley. 2024b. Memoryllm: Towards self-updatable large language models. Preprint, arXiv:2402.04624. https://arxiv.org/abs/2402.04624
[52] Zora Zhiruo Wang, Jiayuan Mao, Daniel Fried, and Graham Neubig. 2024c. Agent workflow memory. arXiv preprint arXiv:2409.07429.
[53] Tianxin Wei, Noveen Sachdeva, Benjamin Coleman, Zhankui He, Yuanchen Bei, Xuying Ning, Mengting Ai, Yunzhe Li, Jingrui He, Ed H. Chi, Chi Wang, Shuo Chen, Fernando Pereira, Wang-Cheng Kang, and Derek Zhiyuan Cheng. 2025. Evo-memory: Benchmarking llm agent test-time learning with self-evolving memory. Preprint, arXiv:2511.20857. https://arxiv.org/abs/2511.20857
[54] Rong Wu, Xiaoman Wang, Jianbiao Mei, Pinlong Cai, Daocheng Fu, Cheng Yang, Licheng Wen, Xuemeng Yang, Yufan Shen, Yuxin Wang, and Botian Shi. 2025. Evolver: Self-evolving llm agents through an experience-driven lifecycle. Preprint, arXiv:2510.16079. https://arxiv.org/abs/2510.16079
[55] Peng Xia, Kaide Zeng, Jiaqi Liu, Can Qin, Fang Wu, Yiyang Zhou, Caiming Xiong, and Huaxiu Yao. 2025. Agent0: Unleashing self-evolving agents from zero data via tool-integrated reasoning. Preprint, arXiv:2511.16043. https://arxiv.org/abs/2511.16043
[56] Buqiang Xu, Yijun Chen, Jizhan Fang, Ruobin Zhong, Yunzhi Yao, Yuqi Zhu, Lun Du, and Shumin Deng. 2026. Structmem: Structured memory for long-horizon behavior in llms. CoRR, abs/2604.21748. https://doi.org/10.48550/ARXIV.2604.21748
[57] Wujiang Xu, Kai Mei, Hang Gao, Juntao Tan, Zujie Liang, and Yongfeng Zhang. 2025. A-mem: Agentic memory for llm agents. arXiv preprint arXiv:2502.12110.
[58] Ke Yang, Zixi Chen, Xuan He, Jize Jiang, Michel Galley, Chenglong Wang, Jianfeng Gao, Jiawei Han, and ChengXiang Zhai. 2026. Plugmem: A task-agnostic plugin memory module for llm agents. Preprint, arXiv:2603.03296. https://arxiv.org/abs/2603.03296
[59]

Chongrui Ye, Yuxiang Liu, Yu Wang, Haofei Yu, Yining Zhao, Ge Liu, Julian McAuley, and Jiaxuan You. 2026. https://arxiv.org/abs/2605.20616 Auto-dreamer: Learning offline memory consolidation for language agents. Preprint, arXiv:2605.20616

[60]

Shicheng Ye, Chao Yu, Kaiqiang Ke, Chengdong Xu, and Yinqi Wei. 2025. https://arxiv.org/abs/2509.12810 H²R: Hierarchical hindsight reflection for multi-task LLM agents. Preprint, arXiv:2509.12810

[61]

Yunpeng Zhai, Shuchang Tao, Cheng Chen, Anni Zou, Ziqian Chen, Qingxu Fu, Shinji Mai, Li Yu, Jiaji Deng, Zouying Cao, Zhaoyang Liu, Bolin Ding, and Jingren Zhou. 2025. https://arxiv.org/abs/2511.10395 Agentevolver: Towards efficient self-evolving agent system. Preprint, arXiv:2511.10395

[62]

Guibin Zhang, Muxin Fu, Guancheng Wan, Miao Yu, Kun Wang, and Shuicheng Yan. 2025a. https://arxiv.org/abs/2506.07398 G-memory: Tracing hierarchical memory for multi-agent systems. Preprint, arXiv:2506.07398

[63]

Guibin Zhang, Haotian Ren, Chong Zhan, Zhenhong Zhou, Junhao Wang, He Zhu, Wangchunshu Zhou, and Shuicheng Yan. 2025b. https://arxiv.org/abs/2512.18746 Memevolve: Meta-evolution of agent memory systems. Preprint, arXiv:2512.18746

[64]

Haozhen Zhang, Quanyu Long, Jianzhu Bao, Tao Feng, Weizhi Zhang, Haodong Yue, and Wenya Wang. 2026a. https://arxiv.org/abs/2602.02474 Memskill: Learning and evolving memory skills for self-evolving agents. Preprint, arXiv:2602.02474

[65]

Shengtao Zhang, Jiaqian Wang, Ruiwen Zhou, Junwei Liao, Yuchen Feng, Weinan Zhang, Ying Wen, Zhiyu Li, Feiyu Xiong, Yutao Qi, Bo Tang, and Muning Wen. 2026b. https://arxiv.org/abs/2601.03192 Memrl: Self-evolving agents via runtime reinforcement learning on episodic memory. Preprint, arXiv:2601.03192

[66]

Zeyu Zhang, Quanyu Dai, Xiaohe Bo, Chen Ma, Rui Li, Xu Chen, Jieming Zhu, Zhenhua Dong, and Ji-Rong Wen. 2025c. A survey on the memory mechanism of large language model-based agents. ACM Transactions on Information Systems, 43(6):1–47

[67]

Andrew Zhao, Daniel Huang, Quentin Xu, Matthieu Lin, Yong-Jin Liu, and Gao Huang. 2024. Expel: LLM agents are experiential learners. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 38, pages 19632–19642

[68]

Huichi Zhou, Yihang Chen, Siyuan Guo, Xue Yan, Kin Hei Lee, Zihan Wang, Ka Yiu Lee, Guchun Zhang, Kun Shao, Linyi Yang, and Jun Wang. 2025. https://arxiv.org/abs/2508.16153 Memento: Fine-tuning LLM agents without fine-tuning LLMs. Preprint, arXiv:2508.16153

[69]

Adam Zweiger, Jyothish Pari, Han Guo, Ekin Akyürek, Yoon Kim, and Pulkit Agrawal. 2025. https://arxiv.org/abs/2506.10943 Self-adapting language models. Preprint, arXiv:2506.10943
