Distillation from visual foundation models to lidar enables frame-wise indoor semantic segmentation without manual annotations, achieving up to 56% mIoU on pseudo labels and 36% on real labels.
3 Pith papers cite this work. Polarity classification is still indexing.
fields: cs.CV (3)
years: 2026 (3)
verdicts: UNVERDICTED (3)
roles: background (1)
polarities: background (1)
citing papers explorer
- Feasibility of Indoor Frame-Wise Lidar Semantic Segmentation via Distillation from Visual Foundation Model
  Distillation from visual foundation models to lidar enables frame-wise indoor semantic segmentation without manual annotations, achieving up to 56% mIoU on pseudo labels and 36% on real labels.
- Hyperbolic Distillation: Geometry-Guided Cross-Modal Transfer for Robust 3D Object Detection
  HGC-Det applies hyperbolic geometry to constrain cross-modal distillation between images and point clouds, with added semantic-guided voxel optimization and feature aggregation, yielding improved accuracy-efficiency trade-offs on SUN RGB-D, ARKitScenes, KITTI, and nuScenes.
- Bridging the Dimensionality Gap: A Taxonomy and Survey of 2D Vision Model Adaptation for 3D Analysis
  The paper offers a taxonomy of 2D-to-3D adaptation strategies divided into data-centric projection, architecture-centric 3D networks, and hybrid methods that combine both.