Local memory attention for fast video semantic segmentation
1 Pith paper cites this work. Polarity classification is still indexing.
fields: cs.CV (1)
years: 2026 (1)
verdicts: UNVERDICTED (1)
representative citing papers
Bootstrapping Video Semantic Segmentation Model via Distillation-assisted Test-Time Adaptation
DiTTA distills SAM2 temporal segmentation knowledge into image models via efficient test-time adaptation and a lightweight fusion module to produce annotation-free video semantic segmentation that matches or exceeds fully supervised performance.