Probes show that LLMs encode causal direction internally, but their Yes/No outputs revert to commonsense on anti-commonsense items, so output accuracy alone does not measure causal understanding.
Preprint arXiv:2005.13407 (2020); journal version 2021
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.CL (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
Causal Tongue-Tie: LLMs Can Encode Causal Direction, But Their Yes/No Outputs Fail to Express