arXiv preprint arXiv:2507.23134 (2025) 9, 12, 13

Jung, S · 2025 · arXiv 2507.23134

3 Pith papers cite this work. Polarity classification is still indexing.

3 Pith papers citing it

representative citing papers

SpaCeFormer: Fast Proposal-Free Open-Vocabulary 3D Instance Segmentation

cs.CV · 2026-04-22 · unverdicted · novelty 6.0

SpaCeFormer delivers 11.1 zero-shot mAP on ScanNet200 (2.8x prior proposal-free best) and runs 2-3 orders of magnitude faster than multi-stage 2D+3D pipelines by using spatial window attention and Morton-curve serialization to predict instance masks from learned queries.

Cross-Attentive Multiview Fusion of Vision-Language Embeddings

cs.CV · 2026-04-14 · unverdicted · novelty 6.0

CAMFusion fuses multiview 2D vision-language embeddings via cross-attention and multiview consistency self-supervision to produce better 3D semantic and instance representations, outperforming averaging and reaching SOTA on benchmarks including zero-shot out-of-domain cases.

MV3DIS: Multi-View Mask Matching via 3D Guides for Zero-Shot 3D Instance Segmentation

cs.CV · 2026-04-10 · unverdicted · novelty 5.0

MV3DIS uses 3D-guided mask matching and depth consistency to produce more consistent multi-view 2D masks that refine into accurate zero-shot 3D instances.

citing papers explorer

Showing 3 of 3 citing papers.

SpaCeFormer: Fast Proposal-Free Open-Vocabulary 3D Instance Segmentation cs.CV · 2026-04-22 · unverdicted · none · ref 2
SpaCeFormer delivers 11.1 zero-shot mAP on ScanNet200 (2.8x prior proposal-free best) and runs 2-3 orders of magnitude faster than multi-stage 2D+3D pipelines by using spatial window attention and Morton-curve serialization to predict instance masks from learned queries.
Cross-Attentive Multiview Fusion of Vision-Language Embeddings cs.CV · 2026-04-14 · unverdicted · none · ref 13
CAMFusion fuses multiview 2D vision-language embeddings via cross-attention and multiview consistency self-supervision to produce better 3D semantic and instance representations, outperforming averaging and reaching SOTA on benchmarks including zero-shot out-of-domain cases.
MV3DIS: Multi-View Mask Matching via 3D Guides for Zero-Shot 3D Instance Segmentation cs.CV · 2026-04-10 · unverdicted · none · ref 20
MV3DIS uses 3D-guided mask matching and depth consistency to produce more consistent multi-view 2D masks that refine into accurate zero-shot 3D instances.

arXiv preprint arXiv:2507.23134 (2025) 9, 12, 13

fields

years

verdicts

representative citing papers

citing papers explorer