LLM Reasoning as Trajectories: Step-Specific Representation Geometry and Correctness Signals

2026 · cs.CL · arXiv 2604.05655

7 Pith papers cite this work. Polarity classification is still indexing.

abstract

This work characterizes large language models' chain-of-thought generation as a structured trajectory through representation space. We show that mathematical reasoning traverses functionally ordered, step-specific subspaces that become increasingly separable with layer depth. This structure already exists in base models, while reasoning training primarily accelerates convergence toward termination-related subspaces rather than introducing new representational organization. While early reasoning steps follow similar trajectories, correct and incorrect solutions diverge systematically at late stages. This late-stage divergence enables mid-reasoning prediction of final-answer correctness with ROC-AUC up to 0.87. Furthermore, we introduce trajectory-based steering, an inference-time intervention framework that enables reasoning correction and length control based on derived ideal trajectories. Together, these results establish reasoning trajectories as a geometric lens for interpreting, predicting, and controlling LLM reasoning behavior.
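The abstract reports mid-reasoning correctness prediction scored by ROC-AUC but does not say which classifier is used. As a minimal sketch only, the following assumes a simple mean-difference linear probe and uses synthetic Gaussian vectors in place of real hidden states; every name here (`roc_auc`, `fit_mean_difference_probe`, `make_synthetic_states`) is hypothetical and is not taken from the paper.

```python
import random


def roc_auc(scores, labels):
    """ROC-AUC as the probability that a random positive example
    receives a higher score than a random negative one (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def mean_vector(rows):
    """Coordinate-wise mean of a non-empty list of equal-length vectors."""
    dim = len(rows[0])
    return [sum(r[i] for r in rows) / len(rows) for i in range(dim)]


def fit_mean_difference_probe(states, labels):
    """Assumed probe: direction from the mean 'incorrect' state
    to the mean 'correct' state."""
    pos = [h for h, y in zip(states, labels) if y == 1]
    neg = [h for h, y in zip(states, labels) if y == 0]
    mu_pos, mu_neg = mean_vector(pos), mean_vector(neg)
    return [a - b for a, b in zip(mu_pos, mu_neg)]


def probe_score(direction, state):
    """Projection of one state onto the probe direction."""
    return sum(w * x for w, x in zip(direction, state))


def make_synthetic_states(n, dim, shift, rng):
    """Synthetic stand-in for hidden states at one reasoning step:
    label 1 ('correct') shifts the first coordinate by `shift`."""
    states, labels = [], []
    for _ in range(n):
        y = rng.randint(0, 1)
        h = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        h[0] += shift * y
        states.append(h)
        labels.append(y)
    return states, labels


rng = random.Random(0)
train_states, train_labels = make_synthetic_states(400, 8, 2.0, rng)
test_states, test_labels = make_synthetic_states(400, 8, 2.0, rng)

direction = fit_mean_difference_probe(train_states, train_labels)
test_scores = [probe_score(direction, h) for h in test_states]
auc = roc_auc(test_scores, test_labels)
print(f"held-out ROC-AUC on synthetic data: {auc:.3f}")
```

With real models, the synthetic vectors would be replaced by hidden states taken at a chosen layer and reasoning step, and the labels by final-answer correctness. The number printed here reflects only the synthetic setup and says nothing about the paper's reported 0.87.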

representative citing papers

SliceGraph: Mapping Process Isomers in Multi-Run Chain-of-Thought Reasoning

cs.AI · 2026-05-14 · unverdicted · novelty 7.0

SliceGraph maps process isomers in multi-run CoT reasoning, finding that 85.5% of 954 problem-model cells show correct trajectories splitting into multiple process families with 76.6% of run pairs cross-family on average.

Uncovering the Representation Geometry of Minimal Cores in Overcomplete Reasoning Traces

cs.AI · 2026-05-14 · unverdicted · novelty 7.0

Language models produce overcomplete reasoning traces where on average 46% of steps can be removed while preserving the answer in 86% of cases, with necessity concentrated in the top three steps.

Semantic Step Prediction: Multi-Step Latent Forecasting in LLM Reasoning Trajectories via Step Sampling

cs.LG · 2026-04-20 · unverdicted · novelty 7.0

Applying STP at consecutive semantic reasoning steps achieves 168x more accurate multi-step latent prediction on ProcessBench than frozen baselines, with trajectories forming smooth curves best captured by non-linear predictors.

LoRi: Low-Rank Distillation for Implicit Reasoning

cs.CL · 2026-06-03 · unverdicted · novelty 6.0

LoRi distills implicit chain-of-thought by matching low-rank structures in hidden states, raising math-reasoning accuracy toward explicit CoT levels on LLaMA and Qwen models.

Right Makes Might: Aligning Verified Hidden States Empowers RL Reasoning

cs.LG · 2026-06-02 · unverdicted · novelty 6.0

Hidden-Align adds an auxiliary loss to align hidden states of correct reasoning paths at the pre-answer token in RLVR, improving pass@1 by 3.8-6.2 points over DAPO on eight math benchmarks for Qwen3 models of 1.7B-14B scale.

Hypothesis generation and updating in large language models

cs.LG · 2026-05-07 · unverdicted · novelty 6.0

LLMs exhibit Bayesian-like hypothesis updating with strong-sampling bias and an evaluation-generation gap but generalize poorly outside observed data.

Geometric Signatures of Reasoning: A Spectral Perspective on Task Hardness

cs.LG · 2026-07-02 · unverdicted · novelty 5.0

Introduces effective dimension d_ρ from spectral analysis of reasoning trajectories to distinguish task hardness (0.93 AUC on MATH500) and uses kinematic features for early correctness prediction from partial generations.

citing papers explorer

SliceGraph: Mapping Process Isomers in Multi-Run Chain-of-Thought Reasoning cs.AI · 2026-05-14 · unverdicted · none · ref 10 · internal anchor
Uncovering the Representation Geometry of Minimal Cores in Overcomplete Reasoning Traces cs.AI · 2026-05-14 · unverdicted · none · ref 54 · internal anchor
Semantic Step Prediction: Multi-Step Latent Forecasting in LLM Reasoning Trajectories via Step Sampling cs.LG · 2026-04-20 · unverdicted · none · ref 8 · internal anchor
LoRi: Low-Rank Distillation for Implicit Reasoning cs.CL · 2026-06-03 · unverdicted · none · ref 12 · internal anchor
Right Makes Might: Aligning Verified Hidden States Empowers RL Reasoning cs.LG · 2026-06-02 · unverdicted · none · ref 24 · internal anchor
Hypothesis generation and updating in large language models cs.LG · 2026-05-07 · unverdicted · none · ref 21 · internal anchor
Geometric Signatures of Reasoning: A Spectral Perspective on Task Hardness cs.LG · 2026-07-02 · unverdicted · none · ref 41 · internal anchor
