Dvqa: Understanding data visualizations via question answering

Kushal Kafle, Brian Price, Scott Cohen, Christopher Kanan · 2018

2 Pith papers cite this work. Polarity classification is still indexing.

2 Pith papers citing it

browse 2 citing papers

representative citing papers

cs.CV · 2025-06-13 · unverdicted · novelty 7.0

VGR introduces a visual-grounded reasoning MLLM that detects and replays image regions during inference, achieving gains on visual benchmarks with 30% fewer image tokens than the LLaVA-NeXT-7B baseline.

SEED-X: Multimodal Models with Unified Multi-granularity Comprehension and Generation

cs.CV · 2024-04-22 · unverdicted · novelty 6.0

SEED-X is a unified multimodal foundation model that handles multi-granularity visual semantics for both comprehension and generation across arbitrary image sizes and ratios.

citing papers explorer

Showing 2 of 2 citing papers.

VGR: Visual Grounded Reasoning cs.CV · 2025-06-13 · unverdicted · none · ref 17
VGR introduces a visual-grounded reasoning MLLM that detects and replays image regions during inference, achieving gains on visual benchmarks with 30% fewer image tokens than the LLaVA-NeXT-7B baseline.
SEED-X: Multimodal Models with Unified Multi-granularity Comprehension and Generation cs.CV · 2024-04-22 · unverdicted · none · ref 70
SEED-X is a unified multimodal foundation model that handles multi-granularity visual semantics for both comprehension and generation across arbitrary image sizes and ratios.

Dvqa: Understanding data visualizations via question answering

fields

years

verdicts

representative citing papers

citing papers explorer