Can Aha Moments Be Fake? Identifying True and Decorative Thinking Steps in Chain-of-Thought

· 2025 · cs.LG · arXiv 2510.24941

5 Pith papers cite this work. Polarity classification is still indexing.

5 Pith papers citing it

open full Pith review browse 5 citing papers arXiv PDF

abstract

Large language models can generate long chain-of-thought (CoT) reasoning, yet prior work suggests that CoT can be post-hoc rationalization rather than a faithful reflection of the computation through explicitly designed settings. In this work, we go further and propose a True Thinking Score (TTS) to quantify the causal contribution of each step in CoT to the model's final prediction in realistic reasoning problems. Across eleven models ranging from 1.5B to 1.1T parameters on common reasoning benchmarks, we find that CoTs often interleave true-thinking steps, which causally affect the final answer, with decorative-thinking steps, which appear useful but have little causal influence; Such decorative steps remain prevalent even for frontier models: Over 30% of steps in Kimi-K2.6 are decorative on MATH with TTS <= 0.005. Furthermore, TTS enables effective CoT pruning: removing 50% of CoT steps with the lowest TTS can largely maintain the performance. Self-training on these pruned CoTs reduces reasoning length by 66% while preserving performance on Nemotron3-Nano-30B. Finally, we provide a mechanistic analysis showing that LLMs can be steered in the latent space to engage or disengage with reasoning steps. Overall, our results reveal that frontier LLMs often verbalize reasoning steps that are not causally used, challenging both the efficiency and the trustworthiness of CoT.

representative citing papers

Observable Patterns Are Not Explanations: A Causal-Geometric Analysis of Latent Reasoning Models

cs.CL · 2026-06-10 · unverdicted · novelty 7.0

Evaluation of two latent reasoning models against controls shows observable latent patterns appear without the proposed mechanisms, have graded causal effects on behavior, and concentrate in structured low-rank directions, arguing that patterns are insufficient evidence for reasoning.

ReasoningFlow: Discourse Structures for Understanding LLM Reasoning Traces

cs.CL · 2026-06-03 · unverdicted · novelty 7.0

ReasoningFlow represents LLM reasoning traces as DAGs, finding structural similarity across models and that most erroneous steps are unused in final answers.

When Chain-of-Thought Fails, the Solution Hides in the Hidden States

cs.CL · 2026-04-25 · unverdicted · novelty 7.0

Activation patching shows individual CoT tokens encode sufficient task-relevant information to recover correct answers on GSM8K, often outperforming both direct prompting and the original (sometimes incorrect) CoT trace.

Measuring and curing reasoning rigidity: from decorative chain-of-thought to genuine faithfulness

cs.CL · 2026-03-24 · unverdicted · novelty 6.0

SLRC quantifies genuine step necessity in LLM reasoning as a causal estimator, LC-CoSR training reduces rigidity with stability guarantees, and evaluations reveal a faithfulness-sycophancy paradox across frontier models.

Decoding the Critique Mechanism in Large Reasoning Models

cs.LG · 2026-03-17 · unverdicted · novelty 6.0

By injecting arithmetic mistakes into CoT reasoning, the paper identifies a hidden critique ability in LRMs and extracts a steerable critique vector that enhances self-correction across model scales.

citing papers explorer

Showing 1 of 1 citing paper after filters.

Decoding the Critique Mechanism in Large Reasoning Models cs.LG · 2026-03-17 · unverdicted · none · ref 12 · internal anchor
By injecting arithmetic mistakes into CoT reasoning, the paper identifies a hidden critique ability in LRMs and extracts a steerable critique vector that enhances self-correction across model scales.

Can Aha Moments Be Fake? Identifying True and Decorative Thinking Steps in Chain-of-Thought

fields

years

verdicts

representative citing papers

citing papers explorer