
arxiv: 1905.10226 · v2 · submitted 2019-05-24 · 💻 cs.CV


Deep Reason: A Strong Baseline for Real-World Visual Reasoning

classification 💻 cs.CV
keywords baseline, performance, real-world, reasoning, strong, visual, achieves, analysis

This paper presents a strong baseline for real-world visual reasoning (GQA), which achieved 60.93% in the GQA 2019 challenge and placed sixth. GQA is a large dataset with 22M questions involving spatial understanding and multi-step inference. To help further research in this area, we identify three components that are crucial to performance: multi-source features, a fine-grained encoder, and a score-weighted ensemble. We provide a series of analyses of their impact on performance.
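The abstract names a score-weighted ensemble but does not define it. One plausible reading, offered only as an illustrative sketch and not as the authors' method, is to average each model's answer distribution with weights proportional to that model's validation score. The function names, the input layout, and the use of validation accuracy as the weight are all assumptions introduced here.

```python
def score_weighted_ensemble(model_probs, val_scores):
    """Combine per-model answer distributions, weighted by validation score.

    model_probs: list of lists; model_probs[m][a] is model m's probability
                 for candidate answer a (hypothetical input layout).
    val_scores:  list of floats; val_scores[m] is model m's validation
                 score (assumed here to be accuracy).
    """
    if not model_probs or len(model_probs) != len(val_scores):
        raise ValueError("need exactly one validation score per model")
    total = sum(val_scores)
    if total <= 0:
        raise ValueError("validation scores must sum to a positive value")

    # Normalize the scores so the weights sum to 1.
    weights = [s / total for s in val_scores]

    num_answers = len(model_probs[0])
    combined = [0.0] * num_answers
    for w, probs in zip(weights, model_probs):
        if len(probs) != num_answers:
            raise ValueError("all models must score the same answer set")
        for a, p in enumerate(probs):
            combined[a] += w * p
    return combined


def predict(model_probs, val_scores):
    """Return the index of the answer with the highest combined probability."""
    combined = score_weighted_ensemble(model_probs, val_scores)
    return max(range(len(combined)), key=combined.__getitem__)
```

With equal scores this reduces to a plain average; as one model's score dominates, the ensemble moves toward that model's prediction. Other weighting schemes (for example, weights learned on held-out data) would fit the same phrase equally well.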
