Towards faithful reasoning in remote sensing: A perceptually-grounded geospatial chain-of-thought for vision-language models

Towards Faithful Reasoning in Remote Sensing: A Perceptually-Grounded GeoSpatial Chain-of-Thought for Vision-Language Models · 2025 · arXiv 2509.22221

6 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 2

citation-polarity summary

background 2

representative citing papers

Earth-OneVision: Extending Remote Sensing Multimodal Large Language Models to More Sensor Modalities and Tasks

cs.CV · 2026-06-09 · unverdicted · novelty 7.0

Earth-OneVision is a unified 2B-parameter RS-MLLM supporting six modalities and nine tasks via FGVLA, SLIS, and PCMA mechanisms plus a 34M QA-pair dataset, reporting competitive or superior benchmark results versus larger models.

RemoteAgent: Bridging Vague Human Intents and Earth Observation with RL-based Agentic MLLMs

cs.CV · 2026-04-09 · unverdicted · novelty 7.0

RemoteAgent uses RL fine-tuning on VagueEO to align MLLMs for vague EO intent recognition, handling simple tasks internally and routing dense predictions to tools via Model Context Protocol.

GeoSearcher: Anchor-Guided Progressive Reasoning for Remote Sensing Visual Grounding with Process Supervision

cs.CV · 2026-07-01 · unverdicted · novelty 6.0

GeoSearcher introduces anchor-centric reasoning supervised fine-tuning and process-faithful group relative policy optimization to improve MLLM-based remote sensing visual grounding.

RemoteZero: Geospatial Reasoning with Zero Human Annotations

cs.CV · 2026-05-06 · unverdicted · novelty 6.0

RemoteZero replaces coordinate supervision with intrinsic semantic verification to enable box-free GRPO training and self-evolution for geospatial reasoning.

RemoteShield: Enable Robust Multimodal Large Language Models for Earth Observation

cs.CV · 2026-04-19 · unverdicted · novelty 6.0

RemoteShield improves robustness of Earth observation MLLMs by training on semantic equivalence clusters of clean and perturbed inputs via preference learning to maintain consistent reasoning under noise.

UniReason-Med: A Shared Grounded Reasoning Interface for 2D-to-3D Transfer in Medical VQA

cs.CV · 2026-06-10 · unverdicted · novelty 4.0

UniReason-Med introduces a unified framework for 2D and 3D medical VQA with shared grounded reasoning, trained on a 220K dataset, claiming that joint 2D+3D supervision improves 3D performance over 3D-only training.

citing papers explorer

RemoteAgent: Bridging Vague Human Intents and Earth Observation with RL-based Agentic MLLMs

cs.CV · 2026-04-09 · unverdicted · none · ref 33

RemoteShield: Enable Robust Multimodal Large Language Models for Earth Observation

cs.CV · 2026-04-19 · unverdicted · none · ref 31
