BTCChat: Advancing remote sensing bi-temporal change captioning with multimodal large language model
3 Pith papers cite this work.
citing papers explorer
- Geospatial-Temporal Sensemaking of Remote Sensing Activity Detections with Multimodal Large Language Model
  Introduces the SMART-HC-VQA dataset with 65k single-image and 2.3M temporal VQA examples plus an adapted LLaVA-NeXT MLLM framework for geospatial-temporal sensemaking of remote sensing construction activity.
- UAV as Urban Construction Change Monitor: A New Benchmark and Change Captioning Model
  PTNet is a prototype-guided task-adaptive model that jointly performs change detection and captioning on bi-temporal UAV imagery by modeling structured change semantics, outperforming prior methods on the new UCCD urban construction benchmark and WHU-CDC.
- Decoding the Delta: Unifying Remote Sensing Change Detection and Understanding with Multimodal Large Language Models
  Delta-LLaVA adds Change-Enhanced Attention, Change-SEG with prior embeddings, and Local Causal Attention to MLLMs to overcome temporal blindness, outperforming general models on a new unified benchmark for bi- and tri-temporal remote sensing tasks.