Measuring and Mitigating Post-hoc Rationalization in Reverse Chain-of-Thought Generation

Chen Yang; Guangyue Peng; Houfeng Wang; Ran Le; Ruixiang Feng; Tao Zhang; Wei Li; Wen Luo; Yang Song; Yuntao Wen


Reverse Chain-of-Thought Generation (RCG) synthesizes reasoning traces from query-answer pairs, but it risks producing post-hoc rationalizations: when models can see the answer during generation, the answer serves as a cognitive anchor that shapes the entire explanation. We formalize this phenomenon through a three-level measurement hierarchy: lexical, entropic, and probabilistic anchoring, which capture surface artifacts, entropy dynamics, and latent answer dependence, respectively. We analyze semantic suppression, the intuitive mitigation strategy that instructs models to ignore the answer, and find that it is counterproductive: while it reduces lexical overlap, it paradoxically increases entropic and probabilistic anchoring. We attribute this failure to active monitoring of the forbidden answer, which inadvertently deepens dependence on it. To break this cycle, we propose Structural Skeleton-guided Reasoning (SSR), whose core contribution is to replace answer suppression with structural decoupling: SSR first generates a response-abstracted functional skeleton designed to limit direct answer encoding and then uses it as a structural target for full trace generation. Experiments across open-ended reasoning benchmarks show that SSR consistently mitigates anchoring, and that Distilled SSR (SSR-D), a distillation variant that internalizes skeleton-guided reasoning from teacher-generated traces, achieves up to 10% improvement over suppression baselines while mitigating out-of-distribution (OOD) degradation.
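The abstract names the three anchoring levels without giving their formulas. As a rough illustration of the first two only, the sketch below computes a surface n-gram overlap between an answer and a trace, and the Shannon entropy of a single next-token distribution. The function names, the whitespace tokenization, and the choice of bigrams are assumptions made for this example, not the paper's definitions; the probabilistic level would need access to model likelihoods and is not sketched here.

```python
import math


def ngrams(tokens, n):
    """Return the list of contiguous n-grams in a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def lexical_overlap(answer: str, trace: str, n: int = 2) -> float:
    """Fraction of the answer's n-grams that also occur in the trace.

    Hypothetical stand-in for a lexical anchoring score (assumption):
    higher values mean more of the answer's surface form leaks into the trace.
    """
    answer_grams = ngrams(answer.lower().split(), n)
    if not answer_grams:
        return 0.0
    trace_grams = set(ngrams(trace.lower().split(), n))
    return sum(1 for g in answer_grams if g in trace_grams) / len(answer_grams)


def shannon_entropy(probs) -> float:
    """Entropy in nats of one next-token probability distribution.

    Tracking this value across generation steps is one way to look at
    entropy dynamics (assumption: not necessarily the paper's measure).
    """
    return -sum(p * math.log(p) for p in probs if p > 0)
```

Under these assumptions, a trace that copies the answer verbatim scores an overlap of 1.0, while a suppression prompt could push the overlap toward 0 without changing how strongly the answer shapes the per-step distributions, which is the dissociation the abstract reports.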

