Earth-OneVision is a unified 2B-parameter RS-MLLM supporting six modalities and nine tasks via FGVLA, SLIS, and PCMA mechanisms plus a 34M QA-pair dataset, reporting competitive or superior benchmark results versus larger models.
Multi-resolution sar and optical remote sensing image registration methods: A review, datasets, and future perspectives
4 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
fields
cs.CV 4years
2026 4roles
background 1polarities
background 1representative citing papers
MetaEarth-MM unifies multi-modal remote sensing image generation and any-to-any translation across five modalities via scene-centered joint modeling on the new EarthMM dataset.
TAR uses frozen text encoders on remote sensing scene descriptions to boost high-level features for coarse-to-fine optical-SAR image registration under large deformations.
UniReason-Med introduces a unified framework for 2D and 3D medical VQA with shared grounded reasoning, trained on a 220K dataset, claiming that joint 2D+3D supervision improves 3D performance over 3D-only training.
citing papers explorer
-
Earth-OneVision: Extending Remote Sensing Multimodal Large Language Models to More Sensor Modalities and Tasks
Earth-OneVision is a unified 2B-parameter RS-MLLM supporting six modalities and nine tasks via FGVLA, SLIS, and PCMA mechanisms plus a 34M QA-pair dataset, reporting competitive or superior benchmark results versus larger models.
-
MetaEarth-MM: Unified Multimodal Remote Sensing Image Generation with Scene-centered Joint Modeling
MetaEarth-MM unifies multi-modal remote sensing image generation and any-to-any translation across five modalities via scene-centered joint modeling on the new EarthMM dataset.
-
TAR: Text Semantic Assisted Cross-modal Image Registration Framework for Optical and SAR Images
TAR uses frozen text encoders on remote sensing scene descriptions to boost high-level features for coarse-to-fine optical-SAR image registration under large deformations.
-
UniReason-Med: A Shared Grounded Reasoning Interface for 2D-to-3D Transfer in Medical VQA
UniReason-Med introduces a unified framework for 2D and 3D medical VQA with shared grounded reasoning, trained on a 220K dataset, claiming that joint 2D+3D supervision improves 3D performance over 3D-only training.