Self-Training Elicits Concise Reasoning in Large Language Models

Tergel Munkhbat, Namgyu Ho, Seo Hyun Kim, Yongjin Yang, Yujin Kim, Se-Young Yun · 2025 · DOI 10.18653/v1/2025.findings-acl.1289

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Compress-Distill: Reasoning Trace Compression for Efficient Knowledge Distillation

cs.LG · 2026-06-04 · unverdicted · novelty 6.0

Post-hoc model-based compression of reasoning traces cuts training tokens to 12-30% and speeds training 2-7.6x while retaining up to 96% of raw-trace accuracy, though raw traces remain superior at every scale.

SlimSearcher: Training Efficiency-Aware Web Agents via Adaptive Reward Gating

cs.LG · 2026-06-05 · unverdicted · novelty 5.0

SlimSearcher reduces tool-call rounds by 17-58% on GAIA, BrowseComp and XBenchDeepSearch while maintaining accuracy via Pareto filtration in SFT and Adaptive Reward Gating in RL.

citing papers explorer

Compress-Distill: Reasoning Trace Compression for Efficient Knowledge Distillation cs.LG · 2026-06-04 · unverdicted · none · ref 25
SlimSearcher: Training Efficiency-Aware Web Agents via Adaptive Reward Gating cs.LG · 2026-06-05 · unverdicted · none · ref 41
