LLM reasoning failures cluster at early entropy-spike transitions; the GUARD inference-time framework redirects them for more reliable results.
In high-precision geometry (AIME 2024), smaller models like DeepSeek-R1-Distill-Qwen-1.5B often falter when facing complex arithmetic
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.AI (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers:
Dissecting Failure Dynamics in Large Language Model Reasoning
LLM reasoning failures cluster at early entropy-spike transitions; the GUARD inference-time framework redirects them for more reliable results.