Mask2Former for Video Instance Segmentation

Bowen Cheng, Anwesa Choudhuri, Ishan Misra, Alexander Kirillov, Rohit Girdhar, Alexander G Schwing · 2021 · arXiv 2112.10764

3 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

method 1

citation-polarity summary

use method 1

representative citing papers

GOLD-BEV: GrOund and aeriaL Data for Dense Semantic BEV Mapping of Dynamic Scenes

cs.CV · 2026-04-21 · unverdicted · novelty 6.0

GOLD-BEV learns dense BEV semantic maps including dynamic agents from ego-centric sensors by using synchronized aerial imagery for training supervision and pseudo-label generation.

Primus: Enforcing Attention Usage for 3D Medical Image Segmentation

cs.CV · 2025-03-03 · unverdicted · novelty 6.0

Primus and PrimusV2 are Transformer-centric models that match or exceed nnU-Net and top CNNs on nine 3D medical segmentation datasets by enforcing attention usage.

PAT-VCM: Plug-and-Play Auxiliary Tokens for Video Coding for Machines

cs.CV · 2026-04-14 · unverdicted · novelty 5.0

PAT-VCM adds lightweight auxiliary tokens to a shared baseline video stream to support multiple downstream machine tasks without task-specific codecs.

citing papers explorer

Showing 3 of 3 citing papers.

GOLD-BEV: GrOund and aeriaL Data for Dense Semantic BEV Mapping of Dynamic Scenes

cs.CV · 2026-04-21 · unverdicted · none · ref 7

GOLD-BEV learns dense BEV semantic maps including dynamic agents from ego-centric sensors by using synchronized aerial imagery for training supervision and pseudo-label generation.

Primus: Enforcing Attention Usage for 3D Medical Image Segmentation

cs.CV · 2025-03-03 · unverdicted · none · ref 13

Primus and PrimusV2 are Transformer-centric models that match or exceed nnU-Net and top CNNs on nine 3D medical segmentation datasets by enforcing attention usage.

PAT-VCM: Plug-and-Play Auxiliary Tokens for Video Coding for Machines

cs.CV · 2026-04-14 · unverdicted · none · ref 9

PAT-VCM adds lightweight auxiliary tokens to a shared baseline video stream to support multiple downstream machine tasks without task-specific codecs.
