Trace2Skill: Distill Trajectory-Local Lessons into Transferable Agent Skills

2026 · cs.AI · arXiv 2603.25158

Canonical reference. 26 Pith papers cite this work; 89% of classified citations cite it as background.

abstract

Large Language Model (LLM) agents increasingly rely on domain-specific skills, yet manually authoring such skills does not scale, and skills generated purely from parametric knowledge often miss critical operational pitfalls. We introduce Trace2Skill, a framework that consolidates broad execution trajectories in parallel into a unified skill directory through inductive reasoning over agent experience. Trace2Skill supports both deepening existing human-written skills and creating useful skills from weak LLM-generated drafts. Experiments demonstrate the effectiveness of Trace2Skill across diverse domains, including office workflows, math reasoning, and vision QA. Importantly, the evolved skills are not merely memorized artifacts of the trajectories used to create them: they often transfer across model scales, across model families, and to out-of-distribution settings. For example, skills evolved from Qwen3.5-35B trajectories improve a Qwen3.5-122B agent by up to 57.65 percentage points on WikiTableQuestions. Further analyses show that Trace2Skill outperforms sequential skill editing and ReasoningBank-style retrieval memories, compresses recurring failures and workarounds into standard operating procedures (SoPs), and yields portable skills that can be reused without parameter updates or test-time retrieval.

citation-role summary

background 9

citation-polarity summary

background 8 · unclear 1

representative citing papers

Skill or Skip? Learning Selective Skill Invocation in Agentic Tasks via Dual-Granularity Preference Learning

cs.CL · 2026-05-30 · unverdicted · novelty 7.0

SelSkill applies dual-granularity preference learning to selective skill-or-skip decisions, improving task success by 10.9 points and execution precision by 29.1 points on ALFWorld with Qwen3-8B.

Skills on the Fly: Test-Time Adaptive Skill Synthesis for LLM Agents

cs.CL · 2026-05-16 · unverdicted · novelty 7.0

SkillTTA synthesizes temporary task-specific skills from retrieved training trajectories to boost LLM agent Pass@1 scores on SpreadsheetBench and BigCodeBench without parameter updates.

SkillFlow: Benchmarking Lifelong Skill Discovery and Evolution for Autonomous Agents

cs.AI · 2026-04-19 · unverdicted · novelty 7.0

SkillFlow benchmark shows lifelong skill evolution yields modest gains for some models like Claude Opus 4.6 but limited or negative utility for others despite high skill usage.

From Multi-Agent to Single-Agent: When Is Skill Distillation Beneficial?

cs.AI · 2026-04-02 · unverdicted · novelty 7.0

Metric Freedom (F), quantified via Mantel test on output diversity and score variance, predicts when single-agent skill distillation from multi-agent systems will succeed, enabling up to 8x cost and 15x latency reductions across tested tasks.

Harnessing Agent Skills: Architectural Patterns and a Reference Architecture for Skill-Mediated LLM Agents

cs.AI · 2026-05-29 · unverdicted · novelty 6.0

Catalogs ten patterns and synthesizes a four-layer reference architecture for skill harnessing in LLM agents, evaluated via cross-instantiation on eight systems.

OptSkills: Learning Generalizable Optimization Skills from Problem Archetypes via Cluster-Based Distillation

cs.AI · 2026-05-28 · unverdicted · novelty 6.0

OptSkills clusters optimization problems by archetypes, distills workflow skills from successful trajectories, and achieves 68.27% micro-averaged accuracy on diverse benchmarks while outperforming DeepSeek-V3.2-Thinking by 4.53% on MIPLIB-NL.

SkillBrew: Multi-Objective Curation of Skill Banks for LLM Agents

cs.CL · 2026-05-28 · unverdicted · novelty 6.0

SkillBrew introduces a Pareto-aware multi-objective optimization framework with bi-level propose-then-verify to curate skill banks for LLM agents, evaluated on two public benchmarks.

Skill-Conditioned Gated Self-Distillation for LLM Reasoning

cs.CL · 2026-05-27 · unverdicted · novelty 6.0

SGSD retrieves skill-mistake pairs to build a multi-teacher pool, validates teacher polarity via a verifier, and applies a gated objective to distill useful signals, yielding 6.2% average gains over GRPO on math benchmarks with Qwen3-1.7B.

From Raw Experience to Skill Consumption: A Systematic Study of Model-Generated Agent Skills

cs.AI · 2026-05-22 · unverdicted · novelty 6.0

A systematic study across five domains finds model-generated skills yield average gains but non-uniform negative transfer, with a meta-skill improving extraction quality.

Ratchet: A Minimal Hygiene Recipe for Self-Evolving LLM Agents

cs.AI · 2026-05-21 · conditional · novelty 6.0

Ratchet provides a minimal hygiene recipe for self-managing skill libraries in frozen LLM agents, delivering +0.328 rolling-mean pass@1 gain on MBPP+ hard-100 and +0.22 peak lift on SWE-bench Verified.

SkillEvolver: Skill Learning as a Meta-Skill

cs.AI · 2026-05-11 · unverdicted · novelty 6.0

A meta-skill authors and refines prose-and-code skills for agents by learning from post-deployment failures with an overfit audit, achieving 56.8% accuracy on SkillsBench tasks versus 43.6% for human-curated skills.

SkillRAE: Agent Skill-Based Context Compilation for Retrieval-Augmented Execution

cs.CL · 2026-05-11 · unverdicted · novelty 6.0

SkillRAE organizes skills into a graph and compiles compact, grounded contexts for LLM agents, yielding 11.7% gains on SkillsBench over prior RAE methods.

SkillGen: Verified Inference-Time Agent Skill Synthesis

cs.LG · 2026-05-09 · unverdicted · novelty 6.0

SkillGen synthesizes auditable skills from agent trajectories via contrastive induction on successes and failures, then verifies net performance impact by comparing outcomes with and without the skill on identical tasks.

ClawTrace: Cost-Aware Tracing for LLM Agent Skill Distillation

cs.AI · 2026-04-26 · unverdicted · novelty 6.0

ClawTrace enables cost-aware LLM agent skill distillation by tracing per-step costs and generating preserve, prune, and repair patches, with ablations showing reduced regressions and prune rules transferring to cut costs by 32%.

Unsupervised Skill Discovery for Agentic Data Analysis

cs.AI · 2026-06-04 · unverdicted · novelty 5.0

DataCOPE uses verifier-guided contrastive distillation from agent trajectories to discover skills, yielding average gains of 9.71% on report-style and 32.30% on reasoning-style data analysis tasks across four model settings.

SkillSmith: Co-Evolving Skills and Tools for Self-Improving Agent Systems

cs.AI · 2026-05-31 · unverdicted · novelty 5.0

SkillSmith introduces a synergy-aware skill-tool co-evolution framework with atomic bundles, Lotka-Volterra-inspired interaction modeling, and anti-pattern recording that outperforms baselines on complex tasks.

SkillsVote: Lifecycle Governance of Agent Skills from Collection, Recommendation to Evolution

cs.CL · 2026-05-18 · unverdicted · novelty 5.0

SkillsVote is a governance system for agent skills that profiles corpora, recommends via search, and gates updates on successful reusable outcomes, yielding benchmark gains without model changes.

Dynamic Skill Lifecycle Management for Agentic Reinforcement Learning

cs.LG · 2026-05-11 · unverdicted · novelty 5.0 · 2 refs

SLIM dynamically optimizes the active external skill set in agentic RL via leave-one-skill-out marginal contribution estimates and lifecycle operations, delivering a 7.1% average gain over baselines on ALFWorld and SearchQA while showing some skills remain externally useful.

Ace-Skill: Bootstrapping Multimodal Agents with Prioritized and Clustered Evolution

cs.AI · 2026-05-09 · unverdicted · novelty 5.0

Ace-Skill boosts multimodal agent self-evolution via prioritized rollouts with lazy-decay tracking and semantic knowledge clustering, yielding up to 35% relative gains on tool-use benchmarks and zero-shot transfer to smaller models.

SkillOpt: Executive Strategy for Self-Evolving Agent Skills

cs.AI · 2026-05-22

Insights Generator: Systematic Corpus-Level Trace Diagnostics for LLM Agents

cs.AI · 2026-05-20 · 2 refs

Library Drift: Diagnosing and Fixing a Silent Failure Mode in Self-Evolving LLM Skill Libraries

cs.AI · 2026-05-19

Evidence Over Plans: Online Trajectory Verification for Skill Distillation

cs.AI · 2026-05-09

A Comprehensive Survey on Agent Skills: Taxonomy, Techniques, and Applications

cs.IR · 2026-05-08 · 2 refs
