Open-Vocabulary 3D Detection via Image-Level Class and Debiased Cross-Modal Contrastive Learning
2 Pith papers cite this work. Polarity classification is still indexing.
citation summary
fields: cs.CV (2)
roles: background (1)
polarities: background (1)
citing papers explorer
- Open-Vocabulary BEV Segmentation with 3D-Aware Geometric Constraints
OVBEVSeg enables open-vocabulary BEV segmentation via 2D-to-BEV pseudo-labeling, joint per-scene optimization, and 3D distillation, outperforming closed-set methods by 15.3 mIoU on unseen nuScenes categories while using less memory and running faster.
- Set-of-Mark Prompting Unleashes Extraordinary Visual Grounding in GPT-4V
Set-of-Mark prompting marks segmented image regions with alphanumerics and masks to let GPT-4V achieve state-of-the-art zero-shot results on referring expression comprehension and segmentation benchmarks like RefCOCOg.