Real-Time Progress Prediction in Reasoning Language Models
Recent reasoning language models, particularly those that employ long latent chains of thought, achieve strong performance on complex agentic tasks. However, as these models operate over increasingly long time horizons, their internal progress becomes opaque to users, making expectation management and real-time oversight difficult. In this work, we investigate whether real-time progress prediction is feasible for such models. We first test whether hidden states encode progress information by discretizing reasoning trajectories and training a linear probe to classify reasoning states. We then fine-tune models to generate progress estimates from 0 to 100% during chain-of-thought reasoning. Our strongest progress-reporting checkpoint reaches 0.161 MAE on mathematical reasoning traces and outperforms position baselines in this setting. Finally, we quantify the intrinsic ambiguity of progress labels by measuring how much the implied progress value varies across continuations sampled from the same partial rollout. This ambiguity is lowest for Qwen3-4B, whose continuations produce the smallest rollout dispersion, suggesting that larger models can make progress labels more stable by reducing variation in remaining solution length.
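The quantities named in the abstract can be made concrete with a small sketch. Everything below is an illustrative assumption, not the paper's implementation: progress is taken to be the fraction of tokens completed on a 0 to 1 scale, the position baseline is assumed to guess progress from token position and a fixed expected length, and rollout dispersion is assumed to be the standard deviation of implied progress across sampled continuations. The function names are hypothetical.

```python
from statistics import mean, pstdev

def progress_labels(n_tokens):
    """Assumed ground truth: fraction of the trace completed at each token."""
    return [(i + 1) / n_tokens for i in range(n_tokens)]

def mae(pred, target):
    """Mean absolute error between predicted and true progress."""
    return mean(abs(p - t) for p, t in zip(pred, target))

def position_baseline(n_tokens, assumed_length):
    """Assumed baseline: progress from position alone, given a fixed
    expected trace length, capped at 1.0."""
    return [min((i + 1) / assumed_length, 1.0) for i in range(n_tokens)]

def rollout_dispersion(prefix_len, continuation_lengths):
    """Assumed ambiguity measure: population std dev of the progress value
    implied by each sampled continuation of the same prefix."""
    implied = [prefix_len / (prefix_len + c) for c in continuation_lengths]
    return pstdev(implied)

# A 4-token trace scored against a baseline that expects 8 tokens.
labels = progress_labels(4)
baseline_error = mae(position_baseline(4, 8), labels)

# The same 10-token prefix continued by rollouts of 10 and 30 more tokens.
dispersion = rollout_dispersion(10, [10, 30])
```

Under these assumptions, a prefix whose continuations all have the same length has zero dispersion, so its progress label is unambiguous. That is the sense in which lower variation in remaining solution length would make labels more stable.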
Forward citations
Cited by 1 Pith paper
Hypothesis generation and updating in large language models
LLMs exhibit Bayesian-like hypothesis updating with strong-sampling bias and an evaluation-generation gap but generalize poorly outside observed data.