OpenSkillEval dynamically builds task instances across five application domains to evaluate 30 open skills with over 600 tests, finding that skill use depends heavily on model and framework and that many popular skills do not beat base agents.
1 Pith paper cites this work.
Fields: cs.CL (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
OpenSkillEval: Automatically Auditing the Open Skill Ecosystem for LLM Agents