A Comprehensive Survey on Agent Skills: Taxonomy, Techniques, and Applications
Pith reviewed 2026-05-20 23:12 UTC · model grok-4.3
The pith
Agent skills serve as reusable procedural artifacts that let LLM agents execute tasks reliably without repeated low-level reasoning.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The paper establishes that skills, as reusable procedural artifacts coordinating tools, memory, and runtime context, form the key operational layer complementing agents' high-level reasoning, and organizes the literature around the four stages of representation, acquisition, retrieval, and evolution to advance scalability in LLM agent systems.
What carries the argument
The four-stage agent skill lifecycle consisting of representation, acquisition, retrieval, and evolution, which structures the review of techniques for creating and maintaining reusable skills.
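As a rough illustration of how the four stages fit together, here is a minimal Python sketch. It is not taken from the paper: the `Skill` and `SkillLibrary` names, the fields, and the word-overlap scorer are all hypothetical, chosen only to make each stage concrete.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Skill:
    """Representation: a named, described, reusable procedure (hypothetical schema)."""
    name: str
    description: str
    run: Callable[[dict], object]
    version: int = 1


@dataclass
class SkillLibrary:
    """Toy in-memory library; real systems would persist and index skills."""
    skills: Dict[str, Skill] = field(default_factory=dict)

    def acquire(self, skill: Skill) -> None:
        """Acquisition: add a new skill to the library."""
        self.skills[skill.name] = skill

    def retrieve(self, query: str, k: int = 1) -> List[Skill]:
        """Retrieval: rank skills by word overlap with the query (assumed toy scorer)."""
        words = set(query.lower().split())

        def score(skill: Skill) -> int:
            return len(words & set(skill.description.lower().split()))

        ranked = sorted(self.skills.values(), key=score, reverse=True)
        return [s for s in ranked[:k] if score(s) > 0]

    def evolve(self, name: str, new_run: Callable[[dict], object]) -> Skill:
        """Evolution: replace a skill's procedure and bump its version."""
        old = self.skills[name]
        updated = Skill(old.name, old.description, new_run, old.version + 1)
        self.skills[name] = updated
        return updated


library = SkillLibrary()
library.acquire(Skill("sum_column", "sum the numeric values in a column",
                      lambda ctx: sum(ctx["values"])))
library.acquire(Skill("count_rows", "count the rows in a table",
                      lambda ctx: len(ctx["values"])))

# The agent describes what it needs; the library returns a stored procedure.
best = library.retrieve("sum values of a column")[0]
result = best.run({"values": [1, 2, 3]})
```

In this sketch the agent never re-derives the summation logic; it only describes what it needs and reuses the stored procedure, which is the division of labor the paper attributes to agents and skills.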
If this is right
- Skills enable reliable execution across similar tasks by reusing proven procedures.
- Systems become more scalable as new tasks leverage existing skill libraries rather than building from scratch.
- Maintainability improves through structured updates and evolution of skills over time.
- Interoperability between different agent frameworks increases with standardized skill representations.
- Applications in complex workflows gain robustness from composable skill combinations.
Where Pith is reading between the lines
- Developers could build shared skill repositories that accelerate agent development across organizations.
- The lifecycle model might extend to non-LLM agents, such as those using traditional planning algorithms.
- Future research could explore automated verification methods for skill quality within this framework.
- Integration with memory systems could create self-improving skill collections.
Load-bearing premise
The diverse literature on LLM-based agents fits into the proposed four stages of the skill lifecycle without forcing unnatural categorizations or leaving out important work.
What would settle it
Identification of a substantial set of agent skill techniques or papers that cannot be classified into any of the four stages: representation, acquisition, retrieval, or evolution.
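One way to make this test operational is to tally stage labels over a labeled corpus and list the papers that fit no stage or more than one. The sketch below assumes a hypothetical mapping from paper titles to stage labels; the example entries are placeholders, not real classifications from the survey.

```python
from collections import Counter

# The four lifecycle stages named in the paper.
STAGES = {"representation", "acquisition", "retrieval", "evolution"}


def coverage_report(labels):
    """Summarize how a labeled corpus falls across the four stages.

    labels: dict mapping a paper title to the set of stage names assigned to it.
    Labels outside STAGES are ignored when deciding whether a paper is classified.
    """
    per_stage = Counter()
    unclassified, cross_stage = [], []
    for title, stages in labels.items():
        valid = set(stages) & STAGES
        if not valid:
            unclassified.append(title)   # fits none of the four stages
        elif len(valid) > 1:
            cross_stage.append(title)    # spans several stages
        per_stage.update(valid)
    return {
        "per_stage": dict(per_stage),
        "unclassified": sorted(unclassified),
        "cross_stage": sorted(cross_stage),
    }


# Placeholder labels for illustration only.
example = {
    "paper A": {"acquisition"},
    "paper B": {"acquisition", "retrieval"},
    "paper C": set(),
}
report = coverage_report(example)
```

A large `unclassified` list would count against the taxonomy, while a large `cross_stage` list would suggest that the stage boundaries are blurrier than a clean partition implies.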
Original abstract
Large language model (LLM)-based agents that reason, plan, and act through tools, memory, and structured interaction are emerging as a promising paradigm for automating complex workflows. Recent systems such as OpenClaw and Claude Code exemplify a broader shift from passive response generation to action-oriented task execution. Yet as agents move toward open-ended, real-world deployment, relying on from-scratch reasoning and low-level tool calls for every task becomes increasingly inefficient, error-prone, and hard to maintain. This survey examines this challenge through the lens of agent skills, which we define as reusable procedural artifacts that coordinate tools, memory, and runtime context under task-specific constraints. Under this view, agents and skills play complementary roles: agents handle high-level reasoning and planning, while skills form the operational layer that enables reliable, reusable, and composable execution. Skills are therefore central to the scalability, robustness, and maintainability of modern agent systems. We organize the literature around four stages of the agent skill lifecycle (representation, acquisition, retrieval, and evolution) and review representative methods, ecosystem resources, and application settings across each stage. We conclude by discussing open challenges in quality control, interoperability, safe updating, and long-term capability management. All related resources, including research papers, open-source data, and projects, are collected for the community at https://github.com/JayLZhou/Awesome-Agent-Skills.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper surveys LLM-based agent skills, defining them as reusable procedural artifacts that coordinate tools, memory, and runtime context under task-specific constraints. It argues that agents and skills play complementary roles: skills form the operational layer for reliable, reusable, and composable execution and are therefore central to scalability, robustness, and maintainability. The literature is organized around four stages of the agent skill lifecycle (representation, acquisition, retrieval, and evolution), with reviews of representative methods, ecosystem resources, and applications in each stage. Open challenges in quality control, interoperability, safe updating, and long-term capability management are discussed, and all resources are collected in a GitHub repository.
Significance. If the four-stage taxonomy provides a non-forced and reasonably complete partition of the literature, the survey would offer a useful organizing framework for researchers building scalable agent systems. The explicit collection of papers, data, and projects in the linked GitHub repository is a concrete strength that enhances reproducibility and community utility beyond the textual review.
major comments (1)
- [Abstract and lifecycle organization section] The central organizational claim—that the existing literature partitions cleanly into the four stages of representation, acquisition, retrieval, and evolution without major omissions or forced categorizations—is load-bearing for the survey's practical value (see Abstract and the opening of the lifecycle section). The manuscript does not supply explicit selection criteria, coverage statistics, or a dedicated discussion of cross-stage methods (e.g., online skill refinement that interleaves acquisition and retrieval), leaving open the risk that the taxonomy imposes artificial boundaries as noted in the stress-test concern.
minor comments (2)
- [Review sections for each lifecycle stage] A summary table or figure listing representative methods per stage with key references would improve readability and allow readers to quickly assess coverage.
- [Conclusion] The GitHub repository link is mentioned but could be accompanied by a brief description of its structure and update policy in the main text.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback and for recognizing the potential utility of the four-stage taxonomy and the GitHub repository. We address the major comment below and will revise the manuscript accordingly to strengthen the presentation of the taxonomy.
Referee: [Abstract and lifecycle organization section] The central organizational claim—that the existing literature partitions cleanly into the four stages of representation, acquisition, retrieval, and evolution without major omissions or forced categorizations—is load-bearing for the survey's practical value (see Abstract and the opening of the lifecycle section). The manuscript does not supply explicit selection criteria, coverage statistics, or a dedicated discussion of cross-stage methods (e.g., online skill refinement that interleaves acquisition and retrieval), leaving open the risk that the taxonomy imposes artificial boundaries as noted in the stress-test concern.
Authors: We agree that the manuscript would benefit from greater transparency regarding how the taxonomy was constructed. The four stages reflect a natural lifecycle progression observed across the surveyed literature, rather than an imposed partition, but we acknowledge that explicit documentation of selection criteria and coverage would help readers evaluate completeness and potential boundary issues. In the revised version, we will add a dedicated subsection (likely in the introduction or at the start of the lifecycle organization section) that outlines the literature search methodology, inclusion criteria, time frame, and approximate coverage statistics (e.g., number of papers reviewed per stage). We will also include a new discussion paragraph or subsection addressing cross-stage methods, with concrete examples such as online skill refinement that interleaves acquisition and retrieval, and how such hybrid approaches are handled or noted within the taxonomy. This addition will explicitly discuss overlaps and mitigate concerns about artificial boundaries.
Revision: yes
Circularity Check
Survey organizes external literature without self-referential derivation
This paper is a literature survey that defines agent skills and organizes existing external work into four lifecycle stages (representation, acquisition, retrieval, evolution) as an organizational framework. It reviews representative methods and resources from the broader literature rather than deriving new quantities, predictions, or results from fitted parameters, self-citations, or internal equations. The complementary roles of agents and skills are presented as a definitional viewpoint to motivate the survey structure, with no load-bearing steps that reduce claims to inputs by construction. No uniqueness theorems, ansatzes, or renamings of known results are invoked in a self-referential manner. The derivation is therefore self-contained against external benchmarks.
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · unclear
Relation between the paper passage and the cited Recognition theorem: unclear. Paper passage:
We organize the literature around four stages of the agent skill lifecycle — representation, acquisition, retrieval, and evolution
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.