VizWiz Grand Challenge: Answering Visual Questions from Blind People

Danna Gurari, Qing Li, Abigale J Stangl, Anhong Guo, Chi Lin, Kristen Grauman, Jiebo Luo, Jeffrey P Bigham

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

More than the Sum: Panorama-Language Models for Adverse Omni-Scenes

cs.CV · 2026-03-10 · unverdicted · novelty 7.0

Panorama-Language Models with a sparse attention module and PanoVQA dataset deliver superior holistic reasoning on 360° adverse omni-scenes compared to stitched pinhole views.

Switch-KD: Visual-Switch Knowledge Distillation for Vision-Language Models

cs.CV · 2026-04-16 · unverdicted · novelty 6.0

A 0.5B student VLM distills from a 3B teacher using visual-switch distillation and DBiLD loss to gain 3.6 points on average across 10 multimodal benchmarks without architecture changes.

citing papers explorer

Showing 2 of 2 citing papers.

More than the Sum: Panorama-Language Models for Adverse Omni-Scenes

cs.CV · 2026-03-10 · unverdicted · none · ref 10

Panorama-Language Models with a sparse attention module and PanoVQA dataset deliver superior holistic reasoning on 360° adverse omni-scenes compared to stitched pinhole views.

Switch-KD: Visual-Switch Knowledge Distillation for Vision-Language Models

cs.CV · 2026-04-16 · unverdicted · none · ref 10

A 0.5B student VLM distills from a 3B teacher using visual-switch distillation and DBiLD loss to gain 3.6 points on average across 10 multimodal benchmarks without architecture changes.
