Zero-shot prompting reaches 59% accuracy at moderate temperatures while chain-of-thought prompting excels at temperature extremes on Olympiad-level math problems, with extended reasoning gains scaling to 14.3x at high temperature.
The effect of sampling temperature on prob- lem solving in large language models
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
fields
cs.CL 1years
2026 1verdicts
UNVERDICTED 1representative citing papers
citing papers explorer
-
Temperature-Dependent Performance of Prompting Strategies in Extended Reasoning Large Language Models
Zero-shot prompting reaches 59% accuracy at moderate temperatures while chain-of-thought prompting excels at temperature extremes on Olympiad-level math problems, with extended reasoning gains scaling to 14.3x at high temperature.