ScanNet: Richly-annotated 3D Reconstructions of Indoor Scenes
2 Pith papers cite this work. Polarity classification is still indexing.
fields: cs.CV (2)
years: 2025 (2)
verdicts: UNVERDICTED (2)
representative citing papers
GraphFusion3D reports improved 3D object detection accuracy on SUN RGB-D and ScanNetV2 by combining adaptive image-to-point fusion with multi-scale graph reasoning on proposals.
citing papers explorer
- SpatialMosaic: A Multiview VLM Dataset for Partial Visibility
SpatialMosaic introduces a 2M-pair multi-view QA dataset and 1M-pair benchmark for MLLMs on spatial reasoning under partial visibility, plus a hybrid baseline that integrates 3D reconstruction models as geometry encoders.
- GraphFusion3D: Dynamic Graph Attention Convolution with Adaptive Cross-Modal Transformer for 3D Object Detection
GraphFusion3D reports improved 3D object detection accuracy on SUN RGB-D and ScanNetV2 by combining adaptive image-to-point fusion with multi-scale graph reasoning on proposals.