A vision-language grounded framework generates and evaluates synthetic remote sensing data, releasing ARAS400k where augmented training outperforms real-data baselines for segmentation and captioning.
Nwpu-captions dataset and mlca-net for remote sensing image captioning.IEEE Trans- actions on Geoscience and Remote Sensing, 60:1–19, 2022
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
fields
cs.CV 1years
2026 1verdicts
CONDITIONAL 1representative citing papers
citing papers explorer
-
Grounding Synthetic Data Generation With Vision and Language Models
A vision-language grounded framework generates and evaluates synthetic remote sensing data, releasing ARAS400k where augmented training outperforms real-data baselines for segmentation and captioning.