SkillOps: Managing LLM Agent Skill Libraries as Self-Maintaining Software Ecosystems

2026 · cs.SE · arXiv 2605.13716

2 Pith papers cite this work. Polarity classification is still indexing.


abstract

Large language model agents increasingly rely on skill libraries for multi-step tasks, yet these libraries can accumulate persistent defects as skills are added, reused, patched, and linked to changing dependencies. We call this failure mode skill technical debt: library-level defects that may not break a single skill locally but can harm future retrieval, composition, and execution. Existing skill-based agents mainly focus on task-time retrieval, planning, and repair, while library-time maintenance remains underexplored. We propose SkillOps, a method-agnostic plug-in framework for maintaining skill libraries. SkillOps represents each skill as a typed Skill Contract (P, O, A, V, F), organizes skills with a Hierarchical Skill Ecosystem Graph, and diagnoses library health across utility, compatibility, risk, and validation dimensions. Given a raw skill library, SkillOps produces a maintained library that can be used by existing retrieval or planning agents without changing their internal code. On ALFWorld, SkillOps achieves 79.5 percent task success as a standalone agent, outperforming the strongest baseline by 8.8 percentage points with no additional task-time large language model calls. As a plug-in layer, it improves retrieval-heavy baselines by 0.68 to 2.90 percentage points. The current rule-based maintenance implementation uses nearly zero library-time large language model calls or tokens, showing that skill-library maintenance can be added as a low-overhead architectural layer.
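The library-time diagnosis described in the abstract can be illustrated with a minimal, rule-based sketch. This is not the paper's implementation. The `Skill` fields, the `diagnose` function, the thresholds, and the example skill names are all hypothetical. The abstract names the Skill Contract fields (P, O, A, V, F) without defining them, so they are not modeled here. Only the four health dimensions (utility, compatibility, risk, validation) come from the abstract.

```python
from dataclasses import dataclass, field


@dataclass
class Skill:
    # Hypothetical bookkeeping fields; not the paper's Skill Contract.
    name: str
    uses: int = 0                      # times the skill was retrieved and run
    successes: int = 0                 # runs that completed the task step
    depends_on: list[str] = field(default_factory=list)
    validated: bool = False            # whether a validation check has passed


def diagnose(library: dict[str, Skill], min_success: float = 0.5) -> dict[str, list[str]]:
    """Flag skills under each health dimension using fixed rules (no LLM calls)."""
    report: dict[str, list[str]] = {
        "utility": [], "compatibility": [], "risk": [], "validation": [],
    }
    for name, skill in library.items():
        if skill.uses == 0:
            report["utility"].append(name)          # never used
        if any(dep not in library for dep in skill.depends_on):
            report["compatibility"].append(name)    # depends on a missing skill
        if skill.uses > 0 and skill.successes / skill.uses < min_success:
            report["risk"].append(name)             # success rate below threshold
        if not skill.validated:
            report["validation"].append(name)       # no passing validation on record
    return report


library = {
    "open_fridge": Skill("open_fridge", uses=10, successes=9, validated=True),
    "cool_item": Skill("cool_item", uses=4, successes=1,
                       depends_on=["open_fridge"], validated=True),
    "heat_item": Skill("heat_item", depends_on=["open_microwave"]),
}
report = diagnose(library)
print(report)
```

In this toy library, `heat_item` is flagged for utility, compatibility, and validation, while `cool_item` is flagged for risk. Each of these is a library-level defect that would not necessarily break any single skill when run in isolation.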

representative citing papers

From Registry to Repository: How AI Agent Skills Are Written, Adapted, and Maintained

cs.SE · 2026-07-01 · unverdicted · novelty 7.0

An empirical study of 41k+ AI agent skills finds that reuse is mostly one-time verbatim copying, with 53% of skills never modified afterward, and that maintenance focuses on additive local adaptations.

Skill-as-Pseudocode: Refactoring Skill Libraries to Pseudocode for LLM Agents

cs.PL · 2026-05-27 · unverdicted · novelty 6.0

SaP converts prose skills to typed pseudocode via clustering and deterministic verification, yielding 82 vs 47 wins on the ALFWorld unseen split versus the Graph-of-Skills baseline.

