Corruption studies of CoT faithfulness largely measure explicit answer placement in prompt format rather than computational importance of reasoning steps.
Title resolution pending
2 Pith papers cite this work. Polarity classification is still indexing.
2
Pith papers citing it
years
2026 2representative citing papers
MOTAB is a new distillation pipeline that monitors on-policy student trajectories and backtracks with teacher intervention to mitigate dual exposure biases, improving reasoning performance by about 3%.
citing papers explorer
-
The Last Word Often Wins: A Format Confound in Chain-of-Thought Corruption Studies
Corruption studies of CoT faithfulness largely measure explicit answer placement in prompt format rather than computational importance of reasoning steps.
-
Backtracking When It Strays: Mitigating Dual Exposure Biases in LLM Reasoning Distillation
MOTAB is a new distillation pipeline that monitors on-policy student trajectories and backtracks with teacher intervention to mitigate dual exposure biases, improving reasoning performance by about 3%.