Mid-Think: Training-Free Intermediate-Budget Reasoning via Token-Level Triggers
Hybrid reasoning language models are commonly controlled through high-level Think/No-think instructions that regulate reasoning behavior, yet we find that this mode switching is driven largely by a small set of trigger tokens rather than by the instructions themselves. Through attention analysis and controlled prompting experiments, we show that a leading "Okay" token induces reasoning behavior, while the newline pattern following "</think>" suppresses it. Based on this observation, we propose Mid-Think, a simple training-free prompting format that combines these triggers to achieve intermediate-budget reasoning and consistently outperforms fixed-token and prompt-based baselines on the accuracy-length trade-off. Furthermore, applying Mid-Think to RL training after SFT reduces training time by approximately 15% while improving the final performance of Qwen3-8B on AIME from 69.8% to 72.4% and on GPQA from 58.5% to 61.1%, demonstrating its effectiveness for both inference-time control and RL-based reasoning training.
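The two triggers named in the abstract can be illustrated with a minimal sketch of how an assistant-turn prefix might be assembled for each mode. The abstract does not give the exact Mid-Think format, so the `mid_think` string below is an assumed combination chosen only for illustration, and the function names `assistant_prefix` and `with_prefix` are hypothetical. The `no_think` string follows the empty think block that Qwen3-style chat templates emit when thinking is disabled.

```python
# Illustrative sketch only: the exact Mid-Think prompt format is NOT stated
# in the abstract. The "mid_think" branch is an assumption for illustration.

THINK_OPEN = "<think>"
THINK_CLOSE = "</think>"


def assistant_prefix(mode: str) -> str:
    """Return text to prefill at the start of the assistant turn."""
    if mode == "think":
        # A leading "Okay" inside the think block is the trigger the
        # abstract reports as inducing reasoning behavior.
        return f"{THINK_OPEN}\nOkay"
    if mode == "no_think":
        # Empty think block; the newline pattern after "</think>" is the
        # trigger the abstract reports as suppressing reasoning.
        return f"{THINK_OPEN}\n\n{THINK_CLOSE}\n\n"
    if mode == "mid_think":
        # ASSUMPTION: combine both triggers by closing an empty think block
        # and then opening the answer with "Okay". The paper's actual
        # format may differ.
        return f"{THINK_OPEN}\n\n{THINK_CLOSE}\n\nOkay"
    raise ValueError(f"unknown mode: {mode!r}")


def with_prefix(rendered_prompt: str, mode: str) -> str:
    """Append the mode prefix to an already rendered chat prompt."""
    return rendered_prompt + assistant_prefix(mode)
```

The string returned by `with_prefix` would then be passed to the model as a prefilled prompt, so that generation continues from the chosen trigger tokens rather than from an instruction.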
Forward citations
Cited by 3 Pith papers
- CausalGuard: Conformal Inference under Graph Uncertainty
  CausalGuard aggregates LLM-proposed and data-pruned DAGs to weight doubly robust pseudo-outcomes and applies conformal calibration to deliver finite-sample marginal coverage for conditional average treatment effects u...
- Reliability-Gated Source Anchoring for Continual Test-Time Adaptation
  RMemSafe gates source anchoring via entropy in CTTA, reducing error by 1.05pp on ResNet-50 when source accuracy collapses and showing shallower degradation slope than prior methods.
- Reliability-Gated Source Anchoring for Continual Test-Time Adaptation
  RMemSafe attenuates source anchoring via entropy gating when the frozen source model degrades, yielding lower error than prior methods on continual corruption benchmarks and shallower degradation under source failure.