Language models don't always say what they think: Unfaithful explanations in chain-of-thought prompting
2 Pith papers cite this work. Polarity classification is still indexing.
Fields: cs.CL (2)
Verdicts: UNVERDICTED (2)
citing papers explorer
- Reasoning Dynamics and the Limits of Monitoring Modality Reliance in Vision-Language Models
  VLMs show answer inertia in CoT reasoning and remain influenced by misleading textual cues even with sufficient visual evidence, making CoT an incomplete window into modality reliance.
- Interpretability from the Ground Up: Stakeholder-Centric Design of Automated Scoring in Educational Assessments
  AnalyticScore applies new FGTI interpretability principles to text-based scoring and achieves accuracy within 0.06 QWK of uninterpretable state-of-the-art while matching human featurization on the ASAP-SAS dataset.