SpatiaLab benchmark shows state-of-the-art VLMs achieve 54.93% accuracy on multiple-choice spatial reasoning in real scenes versus 87.57% for humans.
The slats break the light into parallel bands, producing the striped pattern
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
fields
cs.CV 1years
2026 1verdicts
CONDITIONAL 1representative citing papers
citing papers explorer
-
SpatiaLab: Can Vision-Language Models Perform Spatial Reasoning in the Wild?
SpatiaLab benchmark shows state-of-the-art VLMs achieve 54.93% accuracy on multiple-choice spatial reasoning in real scenes versus 87.57% for humans.