VLMs violate their own stated introspective rules for attributing colors to objects in nearly 60% of cases on items with strong color priors, unlike humans who largely follow theirs, revealing miscalibrated self-knowledge.
arXiv preprint arXiv:2404.18624 , year=
3 Pith papers cite this work. Polarity classification is still indexing.
years
2026 3verdicts
UNVERDICTED 3representative citing papers
Clinical VLMs over-rely on text modality, irrelevant clinical history, and prompt wording when making chest x-ray decisions on MIMIC-CXR data.
A new attention-enhancement method using ARS scores and RVE reduces action-relation hallucinations in LVLMs while generalizing to spatial and object hallucinations.
citing papers explorer
-
When to Call an Apple Red: Humans Follow Introspective Rules, VLMs Don't
VLMs violate their own stated introspective rules for attributing colors to objects in nearly 60% of cases on items with strong color priors, unlike humans who largely follow theirs, revealing miscalibrated self-knowledge.
-
Medical Context Distorts Decisions in Clinical Vision Language Models
Clinical VLMs over-rely on text modality, irrelevant clinical history, and prompt wording when making chest x-ray decisions on MIMIC-CXR data.
-
Mitigating Action-Relation Hallucinations in LVLMs via Relation-aware Visual Enhancement
A new attention-enhancement method using ARS scores and RVE reduces action-relation hallucinations in LVLMs while generalizing to spatial and object hallucinations.