ObjectGraph: From Document Injection to Knowledge Traversal -- A Native File Format for the Agentic Era
1 Pith paper cites this work. Polarity classification is still indexing.
abstract
Every document format in existence was designed for a human reader moving linearly through text. Autonomous LLM agents do not read; they retrieve. This fundamental mismatch forces agents to inject entire documents into their context window, wasting tokens on irrelevant content, compounding state across multi-turn loops, and broadcasting information indiscriminately across agent roles. We argue this is not a prompt engineering problem, not a retrieval problem, and not a compression problem: it is a format problem. We introduce OBJECTGRAPH (.og), a file format that reconceives the document as a typed, directed knowledge graph to be traversed rather than a string to be injected. OBJECTGRAPH is a strict superset of Markdown (every .md file is a valid .og file), requires no infrastructure beyond a two-primitive query protocol, and is readable by both humans and agents without tooling. We formalize the Document Consumption Problem, characterize six structural properties no existing format satisfies simultaneously, and prove OBJECTGRAPH satisfies all six. We further introduce the Progressive Disclosure Model, the Role-Scoped Access Protocol, and Executable Assertion Nodes as native format primitives. Empirical evaluation across five document classes and eight agent task types demonstrates up to 95.3 percent token reduction with no statistically significant degradation in task accuracy (p > 0.05). Transpiler fidelity reaches 98.7 percent content preservation on a held-out document benchmark.
fields: cs.IR (1)
years: 2026 (1)
verdicts: UNVERDICTED (1)
representative citing papers
SkillPager: Query-Adaptive Intra-Skill Navigation via Semantic Node Retrieval
SkillPager retrieves typed semantic nodes from skill documents via MMR to reach 78.89% LLM-judged sufficiency with 47% fewer tokens than full documents on a 395-skill benchmark.
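
The summary above names MMR (Maximal Marginal Relevance) as the retrieval step. As a rough illustration of the general technique only, not of SkillPager's actual implementation, the sketch below greedily selects nodes that are relevant to a query while penalizing similarity to nodes already chosen. The embedding vectors, the use of cosine similarity, the function names `cosine` and `mmr_select`, and the trade-off parameter `lam` are all assumptions made for this example.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mmr_select(query_vec, node_vecs, k, lam=0.7):
    """Return indices of up to k nodes chosen greedily by MMR.

    Each step picks the candidate maximizing
        lam * sim(query, node) - (1 - lam) * max sim(node, already selected).
    lam is a hypothetical default: lam=1.0 is pure relevance ranking,
    smaller values favor diversity.
    """
    relevance = [cosine(query_vec, v) for v in node_vecs]
    candidates = list(range(len(node_vecs)))
    selected = []

    def score(i):
        # Redundancy is the highest similarity to any node picked so far.
        redundancy = max(
            (cosine(node_vecs[i], node_vecs[j]) for j in selected),
            default=0.0,
        )
        return lam * relevance[i] - (1 - lam) * redundancy

    while candidates and len(selected) < k:
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Toy data (hypothetical): nodes 0 and 1 are near-duplicates, node 2 is distinct.
query = [1.0, 0.0, 0.0]
nodes = [[0.9, 0.1, 0.0], [0.9, 0.12, 0.0], [0.7, 0.0, 0.7]]
diverse = mmr_select(query, nodes, k=2, lam=0.3)
relevance_only = mmr_select(query, nodes, k=2, lam=1.0)
```

With a low `lam`, the second pick skips the near-duplicate node in favor of the distinct one; with `lam=1.0` the selection reduces to plain relevance ranking. This is the property that lets an MMR-style retriever return fewer, less redundant nodes than a full document.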