EXG: Self-Evolving Agents with Experience Graphs
Pith reviewed 2026-05-19 22:14 UTC · model grok-4.3
pith:7JHGNWME
The pith
EXG turns agent successes and failures into a connected graph for instant reuse across tasks.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
EXG is the first experience graph designed for self-evolving agents, supporting both online, real-time graph growth during execution for immediate cross-task experience reuse, and offline reuse of a consolidated experience graph as an external memory module. This design also enables EXG to serve as a plug-and-play component for existing self-evolving agents, organizing prior experience into a unified experience graph and improving both solution quality and resource efficiency as deployment progresses.
What carries the argument
The experience graph: a structured, relational representation of accumulated successes and failures that supports both real-time growth and consolidated reuse.
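To make the structure concrete, here is a minimal Python sketch of such a graph. The node and edge kinds (task anchor nodes, case nodes, golden and warning cases, contain/similarity/correction edges) come from the paper passage quoted elsewhere on this page. The class name, the method names, the mapping from success to "golden", and the direction of the correction edge are assumptions made for illustration, not the authors' implementation.

```python
from dataclasses import dataclass, field

# Edge kinds named in the quoted paper passage; their exact semantics are assumed here.
EDGE_KINDS = {"contain", "similarity", "correction"}


@dataclass
class ExperienceGraph:
    # node id -> attribute dict; "kind" is "task" (anchor) or "case"
    nodes: dict = field(default_factory=dict)
    # set of (source id, target id, edge kind) triples
    edges: set = field(default_factory=set)

    def add_task(self, task_id, description):
        self.nodes[task_id] = {"kind": "task", "description": description}

    def add_case(self, case_id, task_id, summary, success):
        if task_id not in self.nodes:
            raise KeyError(f"unknown task: {task_id}")
        # Assumption: a successful attempt is stored as a "golden" case,
        # a failed attempt as a "warning" case.
        label = "golden" if success else "warning"
        self.nodes[case_id] = {"kind": "case", "label": label, "summary": summary}
        self.add_edge(task_id, case_id, "contain")

    def add_edge(self, src, dst, kind):
        if kind not in EDGE_KINDS:
            raise ValueError(f"unknown edge kind: {kind}")
        if src not in self.nodes or dst not in self.nodes:
            raise KeyError("both endpoints must already be in the graph")
        self.edges.add((src, dst, kind))

    def cases_of(self, task_id, label=None):
        """Return the sorted ids of cases contained in a task, optionally filtered by label."""
        found = []
        for src, dst, kind in self.edges:
            if src == task_id and kind == "contain":
                if label is None or self.nodes[dst]["label"] == label:
                    found.append(dst)
        return sorted(found)


g = ExperienceGraph()
g.add_task("t1", "reverse a linked list")
g.add_case("c1", "t1", "crashed on the empty list", success=False)
g.add_case("c2", "t1", "iterative pointer swap with empty-list guard", success=True)
# Assumed direction: the failed case points to the case that corrects it.
g.add_edge("c1", "c2", "correction")
```

Keeping failures as first-class nodes, rather than discarding them, is what lets a later lookup return both what worked and what to avoid for a related task.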
If this is right
- Agents gain immediate cross-task reuse from experiences gathered during execution.
- A consolidated graph can be used offline as external memory to boost later performance.
- Existing self-evolving agents can adopt the graph as a plug-in to organize their prior experience.
- Overall performance-efficiency trade-offs improve compared with ad hoc reflection or fragmented memory.
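The difference between the online and offline modes in the list above can be sketched in a few lines of Python. This is a toy under stated assumptions: a plain set stands in for the experience graph, `solve` is a hypothetical stand-in for an agent that needs fewer attempts when a matching hint exists, and the task families are invented for the example. None of it reflects the paper's actual procedure or numbers.

```python
def solve(task, hints):
    """Toy agent: a matching hint lets it succeed in one attempt instead of three."""
    attempts = 1 if task["family"] in hints else 3
    return {"family": task["family"], "attempts": attempts}


def run_online(tasks):
    """Online mode: memory grows during execution and is usable by the next task."""
    hints = set()
    costs = []
    for task in tasks:
        result = solve(task, hints)
        costs.append(result["attempts"])
        hints.add(result["family"])  # experience becomes available immediately
    return costs, hints


def run_offline(tasks, consolidated_hints):
    """Offline mode: a consolidated memory is reused read-only as external memory."""
    frozen = frozenset(consolidated_hints)
    return [solve(task, frozen)["attempts"] for task in tasks]


stream = [
    {"family": "sorting"},
    {"family": "sorting"},
    {"family": "graphs"},
    {"family": "sorting"},
]
online_costs, consolidated = run_online(stream)
offline_costs = run_offline([{"family": "graphs"}, {"family": "parsing"}], consolidated)

print(online_costs)   # [3, 1, 3, 1]
print(offline_costs)  # [1, 3]
```

In the online run the second sorting task already benefits from the first. In the offline run only task families present in the consolidated memory get cheaper, which is the transfer limit a real evaluation would need to probe.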
Where Pith is reading between the lines
- The graph approach could be tested in domains beyond code and reasoning, such as tool-use or planning agents, to check whether relational linking scales to longer task chains.
- If the structure keeps overhead low, it might reduce reliance on periodic retraining by letting agents carry forward lessons in a compact, queryable form.
- Connections to graph-based memory systems in other AI work could be explored to see whether the same relational pattern supports transfer between entirely different agent types.
Load-bearing premise
Successes and failures accumulated during agent execution can be effectively captured and related in a graph structure that enables immediate and transferable reuse without fragmentation or high overhead.
What would settle it
A direct comparison on the same code generation and reasoning benchmarks showing that agents equipped with the experience graph produce no measurable gains in solution quality or resource efficiency over reflection-only or unstructured-memory baselines.
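One way to phrase that test operationally is a Pareto-dominance check over (quality, cost) pairs. The sketch below assumes higher quality and lower cost are better. The method names and all numbers are placeholders invented for the example, not results from the paper.

```python
def dominates(a, b):
    """True if a is no worse than b on both axes and strictly better on at least one."""
    no_worse = a["quality"] >= b["quality"] and a["cost"] <= b["cost"]
    strictly_better = a["quality"] > b["quality"] or a["cost"] < b["cost"]
    return no_worse and strictly_better


def pareto_front(results):
    """Return the sorted names of methods that no other method dominates."""
    front = []
    for name, point in results.items():
        others = (p for other, p in results.items() if other != name)
        if not any(dominates(p, point) for p in others):
            front.append(name)
    return sorted(front)


# Placeholder values for illustration only; cost could be tokens or attempts per task.
results = {
    "reflection_only": {"quality": 0.60, "cost": 1200},
    "flat_memory": {"quality": 0.62, "cost": 1500},
    "graph_memory": {"quality": 0.62, "cost": 1200},
}

print(pareto_front(results))  # ['graph_memory']
```

Under this framing, the falsifying outcome is a benchmark run in which the graph-equipped agent dominates neither baseline, or is itself dominated.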
Original abstract
Large language model (LLM)-based agents have demonstrated strong capabilities in complex reasoning and problem solving through multi-step interactions, yet most deployed agents remain behaviorally static, with knowledge acquired during execution rarely translating into systematic improvement over time. In response, a growing line of work on self-evolving agents explores how agents can improve through experience during deployment, but most existing approaches either rely on ad hoc reflection limited to single-task correction or adopt unstructured memory that accumulates fragmented experience with delayed usability. To address this limitation, we introduce EXG, an experience graph framework for self-evolving agents that explicitly organizes accumulated successes and failures into a structured, relational representation. EXG is the first experience graph designed for self-evolving agents, supporting both online, real-time graph growth during execution for immediate cross-task experience reuse, and offline reuse of a consolidated experience graph as an external memory module. This design also enables EXG to serve as a plug-and-play component for existing self-evolving agents, organizing prior experience into a unified experience graph and improving both solution quality and resource efficiency as deployment progresses. Extensive experiments across code generation and reasoning benchmarks show that EXG attains more favorable performance-efficiency trade-offs than reflection- and memory-based baselines in both online and offline evaluations. Our results suggest that structuring experience as a graph provides a principled foundation for scalable and transferable self-evolving agent behavior.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces EXG, an experience graph framework for self-evolving LLM-based agents. It organizes accumulated successes and failures into a structured relational graph that supports online real-time growth during execution for immediate cross-task reuse, as well as offline consolidation and reuse as an external memory module. EXG is presented as a plug-and-play component that can be integrated with existing self-evolving agents to improve solution quality and resource efficiency over time. The authors report extensive experiments on code generation and reasoning benchmarks demonstrating more favorable performance-efficiency trade-offs relative to reflection- and memory-based baselines in both online and offline settings.
Significance. If the experimental results hold, the work provides a concrete, graph-structured mechanism for experience reuse that directly targets fragmentation and delayed usability issues in current self-evolving agent designs. The dual support for online growth and offline external-memory use, combined with the plug-and-play integration claim, could offer a reusable primitive for building more adaptive agents. The emphasis on efficiency alongside performance is a practical strength that distinguishes the contribution from purely reflective or flat-memory approaches.
Minor comments (3)
- The abstract asserts 'extensive experiments' with favorable trade-offs but does not preview key metrics, baselines, or dataset details; adding a concise summary of the evaluation protocol in the abstract or introduction would improve accessibility.
- Clarify the precise node and edge definitions for experience fragments early in the manuscript (ideally with a small illustrative example) to make the graph-construction rules immediately understandable before the algorithmic description.
- Ensure that the experimental section includes explicit statements of statistical significance or variance across runs for the reported performance-efficiency trade-offs, as this is necessary to support the cross-baseline claims.
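The variance reporting asked for in the last comment could look like the following Python sketch: a mean and standard deviation per method, plus a percentile bootstrap interval on the paired per-run difference. The function names, the resampling settings, and the per-run scores are assumptions for illustration; the scores are placeholders, not values from the manuscript.

```python
import random
import statistics


def mean_and_std(scores):
    """Sample mean and sample standard deviation across runs."""
    return statistics.mean(scores), statistics.stdev(scores)


def paired_bootstrap_ci(a, b, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the mean of the paired differences a[i] - b[i]."""
    if len(a) != len(b):
        raise ValueError("runs must be paired")
    rng = random.Random(seed)  # fixed seed so the interval is reproducible
    diffs = [x - y for x, y in zip(a, b)]
    means = []
    for _ in range(n_resamples):
        resample = [rng.choice(diffs) for _ in diffs]
        means.append(statistics.mean(resample))
    means.sort()
    low = means[int((alpha / 2) * n_resamples)]
    high = means[int((1 - alpha / 2) * n_resamples) - 1]
    return low, high


# Placeholder per-run scores (same seeds for both methods), invented for the example.
with_graph = [0.62, 0.64, 0.61, 0.63, 0.65]
baseline = [0.60, 0.61, 0.60, 0.62, 0.61]

low, high = paired_bootstrap_ci(with_graph, baseline)
print(mean_and_std(with_graph), mean_and_std(baseline))
print(low, high)
```

An interval that excludes zero would support a cross-baseline claim at the chosen level. With only a handful of runs the interval is coarse, so reporting the raw per-run scores alongside it is the safer choice.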
Simulated Author's Rebuttal
We thank the referee for the positive summary, recognition of the significance of the EXG framework, and recommendation for minor revision. We are pleased that the dual online/offline design and plug-and-play aspects were viewed favorably.
Circularity Check
No significant circularity in derivation chain
The paper presents EXG as a new architectural framework for structuring agent experience into a relational graph, with design choices for online real-time growth and offline consolidation explicitly described as implementation decisions rather than derived predictions. No equations, parameter fits, or self-citations are invoked to force the core claims; the abstract and its positioning against ad hoc reflection rely on stated motivations and reported benchmark outcomes instead of reducing to input definitions or prior author work by construction. The framework is introduced with plug-and-play integration details that remain testable independently of any self-referential loop.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: LLM-based agents accumulate usable experience from successes and failures that can be relationally structured for reuse
invented entities (1)
- Experience Graph (EXG): no independent evidence
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean, washburn_uniqueness_aczel (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: "EXG abstracts each completed attempt within a trajectory into a case... golden cases... warning cases... experience graph G=(V,E) with case nodes, task anchor nodes, contain/similarity/correction edges"
- IndisputableMonolith/Foundation/RealityFromDistinction.lean, reality_from_one_distinction (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: "online self-evolving loop... offline reuse of a consolidated experience graph as an external memory module"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.