Are self-explanations from large language models faithful? In Findings of the Association for Computational Linguistics: ACL 2024, pp. 295–337, Bangkok, Thailand, August 2024

Madsen, Andreas, Chandar, Sarath, Reddy, Siva · 2024 · DOI 10.18653/v1/2024.findings-acl.19

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Reasoning Dynamics and the Limits of Monitoring Modality Reliance in Vision-Language Models

cs.CL · 2026-04-16 · unverdicted · novelty 6.0

VLMs show answer inertia in CoT reasoning and remain influenced by misleading textual cues even with sufficient visual evidence, making CoT an incomplete window into modality reliance.

iPOE: Interpretable Prompt Optimization via Explanations

cs.CL · 2026-05-18 · unverdicted · novelty 5.0

iPOE derives and optimizes guidelines from explanations to create interpretable prompts, yielding up to 31% and 35% gains over standard and random-guideline prompts on four datasets.

citing papers explorer

Showing 2 of 2 citing papers.

Reasoning Dynamics and the Limits of Monitoring Modality Reliance in Vision-Language Models · cs.CL · 2026-04-16 · unverdicted · none · ref 19
iPOE: Interpretable Prompt Optimization via Explanations · cs.CL · 2026-05-18 · unverdicted · none · ref 19
