Tubench: Benchmarking large vision-language models on trustworthiness with unanswerable questions. arXiv preprint arXiv:2410.04107

Xingwei He, Qianru Zhang, A Jin, Yuan Yuan, Siu-Ming Yiu, et al. · arXiv 2410.04107

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

Seeing Isn't Knowing: Do VLMs Know When Not to Answer Spatial Questions (and Why)?

cs.CV · 2026-05-28 · unverdicted · novelty 7.0

Frontier VLMs overconfidently answer spatial questions under occlusion (~30% accuracy) and perspective ambiguity (<10% accuracy) instead of abstaining, and often fail to select helpful additional views.

citing papers explorer

Showing 1 of 1 citing paper.

Seeing Isn't Knowing: Do VLMs Know When Not to Answer Spatial Questions (and Why)? cs.CV · 2026-05-28 · unverdicted · none · ref 5
