Exploring models and data for remote sensing image caption generation
3 Pith papers cite this work. Polarity classification is still indexing.
fields: cs.CV (3)
years: 2026 (3)
verdicts: UNVERDICTED (3)
citing papers explorer
- D2-CDIG: Controlled Diffusion Remote Sensing Image Generation with Dual Priors of DEM and Cloud-Fog
  D2-CDIG conditions diffusion models on DEM and cloud-fog priors to generate controlled remote sensing images with decoupled terrain and atmospheric control.
- Sentinel2Cap: A Human-Annotated Benchmark Dataset for Multimodal Remote Sensing Image Captioning
  Sentinel2Cap provides human-annotated captions for multimodal Sentinel satellite images, with zero-shot tests showing RGB outperforming SAR and prompts helping performance.
- JSSFF: A Joint Structural-Semantic Fusion Framework for Remote Sensing Image Captioning
  JSSFF improves remote sensing image captioning by fusing structural edge details with semantic features in an encoder-decoder model and using fairness-based beam search, outperforming baselines on quantitative and qualitative measures.