Making the V in VQA matter: Elevating the role of image understanding in visual question answering

Yash Goyal, Tejas Khot, Douglas Summers-Stay, Dhruv Batra, Devi Parikh, “Making the V in VQA matter: Elevating the role of image understanding in visual question answering,” in Proceedings of the IEEE conference on computer vision, pa · 2017

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

AeroRAG: Structured Multimodal Retrieval-Augmented LLM for Fine-Grained Aerial Visual Reasoning

cs.CV · 2026-04-20 · unverdicted · novelty 5.0

AeroRAG improves fine-grained aerial visual question answering by converting images to scene graphs and using retrieval-augmented generation to create compact LLM prompts.

citing papers explorer

Showing 1 of 1 citing paper.

AeroRAG: Structured Multimodal Retrieval-Augmented LLM for Fine-Grained Aerial Visual Reasoning

cs.CV · 2026-04-20 · unverdicted · none · ref 39

AeroRAG improves fine-grained aerial visual question answering by converting images to scene graphs and using retrieval-augmented generation to create compact LLM prompts.
