MuRGAt benchmark reveals that strong multimodal models frequently hallucinate citations in complex reasoning tasks despite correct answers, exposing a gap between internal reasoning and verifiable attribution.
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.CL (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)

Representative citing papers
- Multimodal Fact-Level Attribution for Verifiable Reasoning