AeroRAG improves fine-grained aerial visual question answering by converting images to scene graphs and using retrieval-augmented generation to create compact LLM prompts.
OFA: Unifying architectures, tasks, and modalities through a simple sequence-to-sequence learning framework
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.CV (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
AeroRAG: Structured Multimodal Retrieval-Augmented LLM for Fine-Grained Aerial Visual Reasoning