ReVSI rebuilds 3D spatial reasoning benchmarks for VLMs by re-annotating objects and geometry across 381 scenes and creating verified QA pairs that match actual model inputs like 16-64 frames.
1 Pith paper cites this work. Polarity classification is still being indexed.
Fields: cs.CV (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
ReVSI: Rebuilding Visual Spatial Intelligence Evaluation for Accurate Assessment of VLM 3D Reasoning