arXiv: 1801.08163 · v2 · submitted 2018-01-24 · cs.CV · cs.GR

DVQA: Understanding Data Visualizations via Question Answering

keywords: dvqa, algorithms, answering, question, chart, charts, information, many

Bar charts are an effective way to convey numeric information, but today's algorithms cannot parse them. Existing methods fail when faced with even minor variations in appearance. Here, we present DVQA, a dataset that tests many aspects of bar chart understanding in a question answering framework. Unlike visual question answering (VQA), DVQA requires processing words and answers that are unique to a particular bar chart. State-of-the-art VQA algorithms perform poorly on DVQA, and we propose two strong baselines that perform considerably better. Our work will enable algorithms to automatically extract numeric and semantic information from vast quantities of bar charts found in scientific publications, Internet articles, business reports, and many other areas.

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. GENFIG1: Visual Summaries of Scholarly Work as a Challenge for Vision-Language Models

    cs.CV · 2026-04 · unverdicted · novelty 7.0

    GENFIG1 is a new benchmark that tests whether vision-language models can create effective Figure 1 visuals capturing the central scientific idea from paper text.

  2. Chart-RL: Policy Optimization Reinforcement Learning for Enhanced Visual Reasoning in Chart Question Answering with Vision Language Models

    cs.AI · 2026-04 · unverdicted · novelty 6.0

    Chart-RL uses RL policy optimization and LoRA to boost VLM chart reasoning, enabling a 4B model to reach 0.634 accuracy versus 0.580 for an 8B model with lower latency.