TSMNet uses a dual-branch text encoder and text-guided fusion module to integrate scene-level semantic and object-level label features from text with visual embeddings, achieving superior open-vocabulary segmentation on new multimodal remote sensing datasets.
One of them is the semantic segmentation dataset of optical and SAR remote sensing images from Gaofen (GF) satellites, and the images are described manually
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
fields
cs.CV 1years
2026 1verdicts
UNVERDICTED 1representative citing papers
citing papers explorer
-
Open-Vocabulary Semantic Segmentation Network Integrating Object-Level Label and Scene-Level Semantic Features for Multimodal Remote Sensing Images
TSMNet uses a dual-branch text encoder and text-guided fusion module to integrate scene-level semantic and object-level label features from text with visual embeddings, achieving superior open-vocabulary segmentation on new multimodal remote sensing datasets.