Controlled empirical study shows correcting Wikipedia data coverage yields larger gains than algorithm differences in LLM search agent training, with outcome-based rewards competitive.
- Be concise but complete in your final answer
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
fields
cs.CL 1years
2026 1verdicts
UNVERDICTED 1representative citing papers
citing papers explorer
-
Retrieval, Reward, and Training Protocols: What Matters in Training Search Agents?
Controlled empirical study shows correcting Wikipedia data coverage yields larger gains than algorithm differences in LLM search agent training, with outcome-based rewards competitive.