Segment Anything

Alexander Kirillov, Eric Mintun, Nikhila Ravi, Hanzi Mao, Chloe Rolland, Laura Gustafson, Tete Xiao, Spencer Whitehead, Alexander C Berg, Wan-Yen Lo, et al · 2023

7 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

BEA-GS: BEyond RAdiance Supervision in 3DGS for Precise Object Extraction

cs.CV · 2026-05-10 · unverdicted · novelty 7.0

BEA-GS achieves superior object boundary segmentation in 3D Gaussian Splatting by introducing two new losses that adjust geometry of visible and non-visible Gaussians based on semantics.

Space-Time Forecasting of Dynamic Scenes with Motion-aware Gaussian Grouping

cs.CV · 2026-02-25 · unverdicted · novelty 7.0

MoGaF groups Gaussians by motion in 4D splatting representations to enable stable long-term forecasting of dynamic scenes.

EmoVerse: A MLLMs-Driven Emotion Representation Dataset for Interpretable Visual Emotion Analysis

cs.CV · 2025-11-16 · unverdicted · novelty 7.0

EmoVerse is a large open-source dataset enabling interpretable visual emotion analysis via B-A-S triplets, region grounding, and unified CES/DES representations created through an MLLM-driven pipeline.

Selective, Regularized, and Calibrated: Harnessing Vision Foundation Models for Cross-Domain Few-Shot Semantic Segmentation

cs.CV · 2026-05-19 · unverdicted · novelty 6.0

HERA is a select-regularize-calibrate framework adapting frozen vision foundation models for cross-domain few-shot semantic segmentation via hierarchical layer selection with ETR, prior-guided regularization, and pixelwise adaptive calibration, reporting over 4.1 mIoU gains.

Chorus: Multi-Teacher Pretraining for Holistic 3D Gaussian Scene Encoding

cs.CV · 2025-12-19 · unverdicted · novelty 6.0

Chorus pretrains a shared 3D Gaussian scene encoder via multi-teacher distillation to capture holistic features from high-level semantics to fine-grained structure, with strong transfer on segmentation and point-cloud tasks using far fewer scenes.

Hard to See, Hard to Label: Generative and Symbolic Acquisition for Subtle Visual Phenomena

cs.CV · 2026-04-24 · unverdicted · novelty 5.0

GSAL combines diffusion-based visual difficulty scoring with hierarchical semantic coverage to improve active learning retrieval of subtle and rare visual anomalies over standard uncertainty and diversity methods.

Long Story Short: Disentangling Compositionality and Long-Caption Understanding in Contrastive VLMs

cs.CV · 2025-09-23 · unverdicted · novelty 5.0

Empirical study shows bidirectional but sensitive relationship between compositionality and long-caption understanding in VLMs, promoted by high-quality grounded data and affected by architectural choices like frozen positional embeddings.

citing papers explorer

Showing 7 of 7 citing papers.

BEA-GS: BEyond RAdiance Supervision in 3DGS for Precise Object Extraction cs.CV · 2026-05-10 · unverdicted · none · ref 24
Space-Time Forecasting of Dynamic Scenes with Motion-aware Gaussian Grouping cs.CV · 2026-02-25 · unverdicted · none · ref 17
EmoVerse: A MLLMs-Driven Emotion Representation Dataset for Interpretable Visual Emotion Analysis cs.CV · 2025-11-16 · unverdicted · none · ref 17
Selective, Regularized, and Calibrated: Harnessing Vision Foundation Models for Cross-Domain Few-Shot Semantic Segmentation cs.CV · 2026-05-19 · unverdicted · none · ref 27
Chorus: Multi-Teacher Pretraining for Holistic 3D Gaussian Scene Encoding cs.CV · 2025-12-19 · unverdicted · none · ref 26
Hard to See, Hard to Label: Generative and Symbolic Acquisition for Subtle Visual Phenomena cs.CV · 2026-04-24 · unverdicted · none · ref 12
Long Story Short: Disentangling Compositionality and Long-Caption Understanding in Contrastive VLMs cs.CV · 2025-09-23 · unverdicted · none · ref 19
