Entropy minimization on self-generated outputs elicits strong reasoning in pretrained LLMs, matching or exceeding supervised RL methods on benchmarks.
The entropy enigma: Success and failure of entropy minimization
2 Pith papers cite this work. Polarity classification is still indexing.
2
Pith papers citing it
fields
cs.LG 2verdicts
UNVERDICTED 2representative citing papers
MER-DG applies modality-entropy regularization to reduce fusion overfitting in multimodal domain generalization, reporting average gains of 5% over standard fusion and 2% over prior methods on EPIC-Kitchens and HAC benchmarks.
citing papers explorer
-
The Unreasonable Effectiveness of Entropy Minimization in LLM Reasoning
Entropy minimization on self-generated outputs elicits strong reasoning in pretrained LLMs, matching or exceeding supervised RL methods on benchmarks.
-
MER-DG: Modality-Entropy Regularization for Multimodal Domain Generalization
MER-DG applies modality-entropy regularization to reduce fusion overfitting in multimodal domain generalization, reporting average gains of 5% over standard fusion and 2% over prior methods on EPIC-Kitchens and HAC benchmarks.