SkillGen: Verified Inference-Time Agent Skill Synthesis

2026 · cs.LG · arXiv 2605.10999

2 Pith papers cite this work. Polarity classification is still indexing.

abstract

Skills are a promising way to improve LLM agent capabilities without retraining, while keeping the added procedure reusable and controllable. However, high-quality skills are still largely written by hand. We introduce SkillGen, a multi-agent framework that synthesizes a single auditable skill from trajectories generated by a base agent. The output is a human-readable artifact that can be inspected before use. Rather than merely summarizing trajectories, SkillGen uses contrastive induction over both successful and failed trajectories to identify reusable success patterns, recurring failure modes, and behaviors that appear in nearby successes but are missing from failures. SkillGen then generates candidate skills and iteratively refines them. A key novelty of SkillGen is that it models agent skills as interventions and empirically verifies their net effect on overall performance. Specifically, we compare outcomes on the same instances with and without the skill, so that we account for both repairs (cases where the skill fixes a baseline failure) and regressions (cases where the skill breaks a baseline success). Across a broad range of agents and datasets, SkillGen consistently improves held-out performance, outperforms existing skill-generation baselines, and produces skills that transfer across models.
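
The paired comparison described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the names `PairedEffect` and `paired_effect`, the boolean success encoding, and the choice to normalize the net gain by the number of shared instances are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class PairedEffect:
    """Counts from comparing the same instances with and without a skill (hypothetical type)."""
    repairs: int      # baseline failed, run with skill succeeded
    regressions: int  # baseline succeeded, run with skill failed
    total: int        # instances present in both runs

    @property
    def net_gain(self) -> float:
        # Assumed summary statistic: (repairs - regressions) / shared instances.
        return (self.repairs - self.regressions) / self.total if self.total else 0.0


def paired_effect(baseline: Mapping[str, bool],
                  with_skill: Mapping[str, bool]) -> PairedEffect:
    """Compare per-instance outcomes (True = success) on instances found in both runs."""
    shared = baseline.keys() & with_skill.keys()
    repairs = sum(1 for k in shared if not baseline[k] and with_skill[k])
    regressions = sum(1 for k in shared if baseline[k] and not with_skill[k])
    return PairedEffect(repairs, regressions, len(shared))


# Illustrative data only, not results from the paper.
baseline = {"a": True, "b": False, "c": False, "d": True}
with_skill = {"a": True, "b": True, "c": True, "d": False}
effect = paired_effect(baseline, with_skill)
print(effect.repairs, effect.regressions, effect.net_gain)
```

Counting repairs and regressions separately matters because an aggregate accuracy difference alone would hide a skill that fixes some instances while breaking others.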

representative citing papers

OpenSkill: Open-World Self-Evolution for LLM Agents

cs.AI · 2026-06-04 · unverdicted · novelty 5.0

OpenSkill bootstraps LLM agent self-evolution by pulling grounded knowledge and anchors from open-world sources, synthesizing transferable skills, and refining them on self-generated virtual tasks, achieving top benchmark pass rates without supervision.

MUSE-Autoskill: Self-Evolving Agents via Skill Creation, Memory, Management, and Evaluation

cs.AI · 2026-05-26 · unverdicted · novelty 5.0

MUSE-Autoskill introduces a skill-centric framework for self-evolving LLM agents through a unified lifecycle of skill creation, memory, management, evaluation, and refinement.

citing papers explorer

OpenSkill: Open-World Self-Evolution for LLM Agents cs.AI · 2026-06-04 · unverdicted · none · ref 37 · internal anchor
MUSE-Autoskill: Self-Evolving Agents via Skill Creation, Memory, Management, and Evaluation cs.AI · 2026-05-26 · unverdicted · none · ref 14 · internal anchor
