An open-vocabulary pipeline anchors functional edges via 2D visual grounding, then uses temporal 3D graph optimization with evidence accumulation and entropy regularization to build hierarchical scene graphs for dense indoor scenes.
In: CVPR (2024)
2 Pith papers cite this work. Polarity classification is still indexing.
Citing papers by year: 2026 (2)
Verdicts: UNVERDICTED (2)
Representative citing papers
Dual use of SAM for broader target pixel learning and DINOv3 for domain-invariant prototypes yields +1.3% and +1.4% mIoU gains over baselines on GTA-to-Cityscapes and SYNTHIA-to-Cityscapes.
citing papers explorer
- Hierarchical and Holistic Open-Vocabulary Functional 3D Scene Graphs for Indoor Spaces
  An open-vocabulary pipeline anchors functional edges via 2D visual grounding, then uses temporal 3D graph optimization with evidence accumulation and entropy regularization to build hierarchical scene graphs for dense indoor scenes.
- Dual-Foundation Models for Unsupervised Domain Adaptation
  Dual use of SAM for broader target pixel learning and DINOv3 for domain-invariant prototypes yields +1.3% and +1.4% mIoU gains over baselines on GTA-to-Cityscapes and SYNTHIA-to-Cityscapes.