Stepwise Perplexity-Guided Refinement for Efficient Chain-of-Thought Reasoning in Large Language Models

Cui, Yingqian; He, Pengfei; Zeng, Jingying; Liu, Hui; Tang, Xianfeng; Dai, Zhenwei · 2025 · DOI 10.18653/v1/2025.findings-acl.956

1 Pith paper cites this work. Polarity classification is still being indexed.

representative citing papers

SlimSearcher: Training Efficiency-Aware Web Agents via Adaptive Reward Gating

cs.LG · 2026-06-05 · unverdicted · novelty 5.0

SlimSearcher reduces tool-call rounds by 17-58% on GAIA, BrowseComp and XBenchDeepSearch while maintaining accuracy via Pareto filtration in SFT and Adaptive Reward Gating in RL.

SlimSearcher: Training Efficiency-Aware Web Agents via Adaptive Reward Gating
cs.LG · 2026-06-05 · unverdicted · none · ref 42