Breaking the Chain: A Causal Analysis of LLM Faithfulness to Intermediate Structures

Alexander Panchenko; Elena Tutubalina; Gleb Ershov; Karim Vafin; Mikhail Chaichuk; Mikhail Seleznyov; Oleg Somov

In schema-guided reasoning (SGR) pipelines, LLMs produce explicit intermediate structures -- rubrics, checklists, or verification queries -- before committing to a final decision. SGR is increasingly adopted because it promises controllability: practitioners expect to inspect, edit, and override these structures to steer the outcome. But does the promise hold? We introduce a causal evaluation protocol to measure it: by selecting tasks where a deterministic function maps intermediate structures to decisions, every controlled edit implies a unique correct output. Across 12 models and 4 benchmarks, models appear self-consistent with their own intermediate structures but fail to update predictions after intervention -- revealing that apparent faithfulness is fragile once the intermediate structure changes. When derivation of the final decision from the structure is delegated to an external tool, this fragility largely disappears; stronger prompting yields only limited improvements, while preference optimization substantially improves intervention faithfulness. Overall, intermediate structures in schema-guided pipelines function as influential context rather than stable causal mediators.
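The intervention protocol described above can be illustrated with a minimal sketch. The abstract does not specify the tasks, the mapping functions, or the exact metric, so everything here is an assumption made for illustration: the checklist format, the all-items-must-pass rule in `derive_decision`, the single-item flip in `flip_item`, and the `agreement` score are hypothetical stand-ins, not the authors' implementation. The point is only that when a deterministic function maps the structure to the decision, each edit has exactly one correct output to score against.

```python
from typing import Callable, Dict, List

# Hypothetical intermediate structure: a checklist of named pass/fail items.
Checklist = Dict[str, bool]


def derive_decision(checklist: Checklist) -> str:
    """Assumed deterministic rule: accept only if every item passes."""
    return "accept" if all(checklist.values()) else "reject"


def flip_item(checklist: Checklist, key: str) -> Checklist:
    """Controlled edit: return a copy with one item's value inverted."""
    edited = dict(checklist)
    edited[key] = not edited[key]
    return edited


def agreement(structures: List[Checklist],
              decide: Callable[[Checklist], str]) -> float:
    """Fraction of structures where decide() matches the deterministic rule."""
    if not structures:
        return 0.0
    hits = sum(decide(s) == derive_decision(s) for s in structures)
    return hits / len(structures)


# Two toy cases: (original checklist, item to flip).
cases = [
    ({"cites_source": True, "answers_question": True}, "cites_source"),
    ({"cites_source": True, "answers_question": False}, "answers_question"),
]
originals = [checklist for checklist, _ in cases]
edited = [flip_item(checklist, key) for checklist, key in cases]


def always_accept(checklist: Checklist) -> str:
    """Stand-in for a decider that ignores the structure it is given."""
    return "accept"


if __name__ == "__main__":
    # Delegating the decision to the rule itself (the "external tool" setting)
    # agrees with the edited structures by construction.
    print("tool, after edits:", agreement(edited, derive_decision))
    # A decider that ignores the structure agrees only by coincidence.
    print("ignoring structure, after edits:", agreement(edited, always_accept))
```

In a real evaluation, `always_accept` would be replaced by a call to the model under test, with `agreement` computed once on the model's own structures (self-consistency) and once on the edited ones (intervention faithfulness); the gap between the two scores is what the abstract describes as fragility.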
