MM-AQA shows frontier VLMs rarely abstain on unanswerable multimodal questions, multi-agent setups improve abstention at an accuracy cost, and effective abstention needs training rather than prompting or extra agents.
(3) Semantic Unanswerability. Modifies the question while leaving all images unchanged; requires non-trivial reasoning to detect because the image content is intact.
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.CL (1). Years: 2026 (1). Verdicts: UNVERDICTED (1).
Representative citing papers:
Knowing When Not to Answer: Evaluating Abstention in Multimodal Reasoning Systems