Semi-CoT selects low-entropy pseudo-CoT chains from unlabeled questions via answer-level semantic entropy and shows high pseudo-answer precision but only small or negative gains on math reasoning benchmarks.
Mean teachers are better role models: Weight-averaged consistency targets improve semi-supervised deep learning results. Advances in Neural Information Processing Systems, 30, 2017.
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.AI (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
Revisiting Chain-of-Thought Reasoning under Limited Supervision: Semi-supervised Chain-of-Thought Learning