StyleBench shows that greater reasoning-structure complexity improves LLM accuracy only in limited regimes set by task demands and model capacity: search-based styles fail on small models, and RL-based style selection outperforms supervised fine-tuning.
No solution
Structured of Thoughts (SoT) Response: a = 2, b = 3, c = 13, d = 13
Step 1: Consider possible combinations of operations
1 Pith paper cites this work.
Fields: cs.LG (1) · Years: 2025 (1) · Verdicts: UNVERDICTED (1)
Representative citing papers
StyleBench: Evaluating thinking styles in Large Language Models