Knowledge-graph paths reused as intermediate supervision improve self-evolving search agents over standard Search Self-Play on seven QA benchmarks by supplying relational context and graded waypoint rewards.
Near or after the divergence point, add constraints that exclude distractors.
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.AI (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
Knowledge-Graph Paths as Intermediate Supervision for Self-Evolving Search Agents