Where To Look: Focus Regions for Visual Question Answering

Derek Hoiem; Kevin J. Shih; Saurabh Singh

arXiv: 1511.07394 · v2 · pith:VJIXI4WLnew · submitted 2015-11-23 · cs.CV

Kevin J. Shih, Saurabh Singh, Derek Hoiem

classification: cs.CV

keywords: answering, regions, visual, dataset, image, method, question, questions

We present a method that learns to answer visual questions by selecting image regions relevant to the text-based query. Our method exhibits significant improvements in answering questions such as "what color," where it is necessary to evaluate a specific location, and "what room," where it selectively identifies informative image regions. We test our model on the VQA dataset, which is, to our knowledge, the largest human-annotated visual question answering dataset.
