VRSBench: A Versatile Vision-Language Benchmark Dataset for Remote Sensing Image Understanding
3 Pith papers cite this work. Polarity classification is still indexing.
Fields: cs.CV (3)
Verdicts: UNVERDICTED (3)
citing papers explorer
- RSRCC: A Remote Sensing Regional Change Comprehension Benchmark Constructed via Retrieval-Augmented Best-of-N Ranking
  RSRCC is a new 126k-question benchmark for fine-grained remote sensing change question-answering, constructed via a hierarchical semi-supervised pipeline with retrieval-augmented Best-of-N ranking.
- Visual Reasoning Agent: Robust Vision Systems in Remote Sensing via Inference-Time Scaling
  VRA is a training-free agentic framework that orchestrates off-the-shelf LVLMs with a reasoning model via iterative verification and refinement, raising accuracy on remote sensing VQA from 52.8% to 78.8% and delivering up to 40.67% gains on hard question types.
- UniReason-Med: A Shared Grounded Reasoning Interface for 2D-to-3D Transfer in Medical VQA
  UniReason-Med introduces a unified framework for 2D and 3D medical VQA with shared grounded reasoning, trained on a 220K dataset, claiming that joint 2D+3D supervision improves 3D performance over 3D-only training.