arXiv preprint arXiv:2404.03650 (2024)
4 Pith papers cite this work. Polarity classification is still indexing.
fields: cs.CV (4)
years: 2026 (4)
verdicts: UNVERDICTED (4)
roles: background (2)
polarities: background (2)
representative citing papers
3AM integrates MUSt3R 3D features into SAM2 via a Feature Merger and FOV-aware sampling to deliver geometry-consistent video object segmentation from RGB alone, with large gains on wide-baseline datasets.
EPS3D is an end-to-end architecture for 3D panoptic segmentation from multi-view images that uses distillation and semantic-instance mutual enhancement to achieve higher benchmark performance and speed than prior methods.
ClickSeg3D uses a point Transformer encoder and hierarchical mask decoder with semantic embeddings to enable single-pass multi-object 3D interactive segmentation from sparse points, reporting over 20% mIoU gains versus baselines and 8-10% cross-dataset improvements with one click per instance.
citing papers explorer
- Relation-Centric Open-Vocabulary 3D Gaussian Segmentation
  PairGS builds a relation graph from sparse pairwise affinities on 3D Gaussians to achieve state-of-the-art open-vocabulary segmentation, with a variant that is 50x faster than optimization-based methods.
- 3AM: 3egment Anything with Geometric Consistency in Videos
  3AM integrates MUSt3R 3D features into SAM2 via a Feature Merger and FOV-aware sampling to deliver geometry-consistent video object segmentation from RGB alone, with large gains on wide-baseline datasets.
- EPS3D: End-to-End Feed-Forward 3D Panoptic Segmentation
  EPS3D is an end-to-end architecture for 3D panoptic segmentation from multi-view images that uses distillation and semantic-instance mutual enhancement to achieve higher benchmark performance and speed than prior methods.
- ClickSeg3D: Few-Click Interactive Segmentation via Semantic Embeddings
  ClickSeg3D uses a point Transformer encoder and hierarchical mask decoder with semantic embeddings to enable single-pass multi-object 3D interactive segmentation from sparse points, reporting over 20% mIoU gains versus baselines and 8-10% cross-dataset improvements with one click per instance.