A survey that taxonomizes agent skills for LLM-based agents across representation, acquisition, retrieval, and evolution stages while reviewing methods, resources, and open challenges.
Don't Retrieve, Navigate: Distilling Enterprise Knowledge into Navigable Agent Skills for QA and RAG
1 Pith paper cite this work. Polarity classification is still indexing.
abstract
Retrieval-Augmented Generation (RAG) grounds LLM responses in external evidence but treats the model as a passive consumer of search results, with no view of how the corpus is organized or what it has not yet seen. We present Corpus2Skill, which distills a document corpus offline into a hierarchical skill directory and lets an LLM agent navigate it at serve time, drilling from a bird's-eye view through progressively finer summaries down to documents, and backtracking when a branch is unproductive. On an enterprise customer-support benchmark, Corpus2Skill improves both answer quality and grounding over single-shot dense, hybrid, hierarchical-retrieval, and agentic RAG baselines at a moderate cost tradeoff. A ten-subset generalization study further shows that corpus navigation is not a universal replacement for retrieval: it consistently helps on single-domain corpora with a recoverable topical taxonomy, but flat retrieval remains preferable on open-domain factoid pools or homogeneous-tabular corpora that defeat top-level clustering. We characterize this scope distinction and discuss it as a design guideline for knowledge-grounded systems. Code is available at https://github.com/dukesun99/Corpus2Skill.
citation-role summary
citation-polarity summary
fields
cs.IR 1years
2026 1verdicts
UNVERDICTED 1roles
background 1polarities
background 1representative citing papers
citing papers explorer
-
A Comprehensive Survey on Agent Skills: Taxonomy, Techniques, and Applications
A survey that taxonomizes agent skills for LLM-based agents across representation, acquisition, retrieval, and evolution stages while reviewing methods, resources, and open challenges.