VG-CoT is a new scalable dataset and three-axis benchmark that improves grounded chain-of-thought reasoning in LVLMs by explicitly tying each reasoning step to visual evidence.
Additionally, for the GQA dataset, the rich scene graph information that the dataset itself provides is leveraged as supplementary initial evidence.
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.CV (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
VG-CoT: Towards Trustworthy Visual Reasoning via Grounded Chain-of-Thought