Stepwise Perplexity-Guided Refinement for Efficient Chain-of-Thought Reasoning in Large Language Models
1 Pith paper cites this work. Polarity classification is still indexing.
fields: cs.LG (1)
years: 2026 (1)
verdicts: UNVERDICTED (1)
representative citing papers
- SlimSearcher: Training Efficiency-Aware Web Agents via Adaptive Reward Gating
SlimSearcher reduces tool-call rounds by 17-58% on GAIA, BrowseComp and XBenchDeepSearch while maintaining accuracy via Pareto filtration in SFT and Adaptive Reward Gating in RL.