Empirical evaluation on Gemini 2.5 models shows self-consistency yields only 0.4% gain on HotpotQA and 1.6% on MATH-500 across 20 samples while token costs scale linearly, with performance plateauing or declining at higher counts.
Title resolution pending
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.AI (1)
Years: 2025 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
Self-Consistency Is Losing Its Edge: Diminishing Returns and Rising Costs in Modern LLMs