The DISSECT benchmark reveals that VLMs extract visual details from scientific diagrams but frequently lose them during reasoning, with open-source models showing a larger integration gap than closed-source ones.
- Note the position of each label relative to the structure it annotates
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.CV (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers:
- DISSECT: Diagnosing Where Vision Ends and Language Priors Begin in Scientific VLMs