o1-like models overthink easy tasks; self-training reduces compute use without accuracy loss on GSM8K, MATH500, GPQA, and AIME.
The Eleventh International Conference on Learning Representations , year=
2 Pith papers cite this work. Polarity classification is still indexing.
2
Pith papers citing it
verdicts
UNVERDICTED 2representative citing papers
RADAR generates query-adaptive multi-agent communication structures via conditional discrete graph diffusion guided by effective graph size, outperforming baselines on accuracy and token consumption across six benchmarks.
citing papers explorer
No citing papers match the current filters.