Curated Skills boost LLM agent pass rates by 16.2 percentage points on average across 86 tasks, but self-generated Skills provide no benefit, with large variation by domain and some negative effects.
Of 322 candidate submissions from 105 contributors, 86 tasks passed all review stages and were included in the final benchmark (26.7% acceptance rate).
1 Pith paper cites this work.
Field: cs.AI (1). Year: 2026 (1). Verdict: UNVERDICTED (1).
Representative citing paper:
SkillsBench: Benchmarking How Well Agent Skills Work Across Diverse Tasks