From Raw Experience to Skill Consumption: A Systematic Study of Model-Generated Agent Skills

2026 · cs.AI · arXiv 2605.23899

2 Pith papers cite this work. Polarity classification is still indexing.

abstract

Language agents increasingly improve by reusing skills: structured procedural artifacts distilled from past experience. In particular, domain-level and model-generated skills are especially promising. They offer fast adaptation within a domain by encoding domain-specific recurring procedures, and they scale beyond labor-intensive hand-crafting. However, while extraction methods continue to proliferate, understanding remains limited, with no comprehensive study spanning the full skill lifecycle (experience generation, skill extraction, and skill consumption) to ask whether such skills actually work, when they work, and what makes them succeed or fail. To close this gap, we build a utility-grounded evaluation framework that provides systematic experimental results across extractors and target agents, covering five diverse agentic task domains. We find that model-generated skills are beneficial on average but exhibit non-trivial negative transfer, and that neither extractors nor targets behave uniformly. A model can be a strong extractor yet a weak consumer, or vice versa, with skill utility independent of model scale or baseline task strength. To explain these patterns, we then dissect each lifecycle stage in depth, analyzing how experience composition shapes skill quality, what properties characterize useful skills, and how the same skill transfers across different consumers. Finally, we translate these findings into a concrete meta-skill that guides skill extraction toward the features tied to actual utility, which consistently improves skill quality across domains and substantially reduces negative transfer.

representative citing papers

Workflow-to-Skill: Skill Creation via Routing-Workflow-Semantics-Attachments Decomposition

cs.AI · 2026-06-05 · unverdicted · novelty 6.0

W2S framework with RWSA decomposition converts heterogeneous traces into Skills and improves behavioral replay consistency by 10.5% over summarization baselines on 70 Skills.

SkillAdaptor: Self-Adapting Skills for LLM Agents from Trajectories

cs.CL · 2026-05-31 · unverdicted · novelty 6.0

SkillAdaptor introduces step-level failure attribution and targeted skill updates for LLM agents, yielding performance gains on WebShop, PinchBench, and Claw-Eval benchmarks.

citing papers explorer

Showing 1 of 1 citing paper after filters.

Workflow-to-Skill: Skill Creation via Routing-Workflow-Semantics-Attachments Decomposition

cs.AI · 2026-06-05 · unverdicted · none · ref 8 · internal anchor

W2S framework with RWSA decomposition converts heterogeneous traces into Skills and improves behavioral replay consistency by 10.5% over summarization baselines on 70 Skills.
