Rethinking remote sensing CLIP: Leveraging multimodal large language models for high-quality vision-language dataset

Yiguo He, Junjie Zhu, Yiying Li, Qiangjuan Huang, Zhiyuan Wang, Ke Yang

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

SARVLM: A Vision Language Foundation Model for Semantic Understanding in SAR Imagery

cs.CV · 2025-10-26 · unverdicted · novelty 7.0

SARVLM is the first vision-language foundation model for SAR, trained via domain transfer on a 1M image-text dataset and outperforming prior models on 13 benchmarks for retrieval, recognition, detection, and captioning.

citing papers explorer

Showing 1 of 1 citing paper.

SARVLM: A Vision Language Foundation Model for Semantic Understanding in SAR Imagery

cs.CV · 2025-10-26 · unverdicted · none · ref 10

SARVLM is the first vision-language foundation model for SAR, trained via domain transfer on a 1M image-text dataset and outperforming prior models on 13 benchmarks for retrieval, recognition, detection, and captioning.
