The Stepwise Informativeness Assumption explains the correlation between LLM entropy dynamics and reasoning correctness by positing that correct traces accumulate answer-relevant information stepwise during generation.
CoRR , volume =
4 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
years
2026 4verdicts
UNVERDICTED 4roles
background 1polarities
background 1representative citing papers
FPRM is a Transformer-based model using fixed-point convergence for adaptive halting in looped architectures, claimed effective on Sudoku, Maze, state-tracking, and ARC-AGI benchmarks.
CoT traces align with internal answer commitment in only 61.9% of steps on average, dominated by confabulated continuations after commitment has stabilized.
Reasoning budget in LRMs functions as a generation ceiling rather than a real-time dial, leaving cognitive cost alignment with humans invariant across effort levels and supporting a training-time compiled account.
citing papers explorer
-
The Stepwise Informativeness Assumption: Why are Entropy Dynamics and Reasoning Correlated in LLMs?
The Stepwise Informativeness Assumption explains the correlation between LLM entropy dynamics and reasoning correctness by positing that correct traces accumulate answer-relevant information stepwise during generation.
-
Fixed-Point Reasoners: Stable and Adaptive Deep Looped Transformers
FPRM is a Transformer-based model using fixed-point convergence for adaptive halting in looped architectures, claimed effective on Sudoku, Maze, state-tracking, and ARC-AGI benchmarks.
-
When Reasoning Traces Become Performative: Step-Level Evidence that Chain-of-Thought Is an Imperfect Oversight Channel
CoT traces align with internal answer commitment in only 61.9% of steps on average, dominated by confabulated continuations after commitment has stabilized.
-
Effort as Ceiling, Not Dial: Reasoning Budget Does Not Modulate Cognitive Cost Alignment Between Humans and Large Reasoning Models
Reasoning budget in LRMs functions as a generation ceiling rather than a real-time dial, leaving cognitive cost alignment with humans invariant across effort levels and supporting a training-time compiled account.