Linear probes detect task format confounds rather than distinct reasoning modes in LLM hidden states across LogiQA, ARC, and αNLI benchmarks.
Preprint, arXiv:2603.09200
1 Pith paper cites this work. Polarity classification is still being indexed.
Fields: cs.CL (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers:
Linear Probes Detect Task Format, Not Reasoning Mode in Language Model Hidden States