VoQA: Visual-only question answering. arXiv preprint arXiv:2505.14227, 2025a

An, J · arXiv 2505.14227

1 Pith paper cites this work. Polarity classification is still indexing.


representative citing papers

VISTA-Bench: Do Vision-Language Models Really Understand Visualized Text as Well as Pure Text?

cs.CV · 2026-02-04 · conditional · novelty 7.0

VISTA-Bench shows vision-language models degrade on visualized text in images compared to equivalent pure text, with larger gaps under increased perceptual difficulty.

