EarthDial: Turning multi-sensory earth observations to interactive dialogues

Sagar Soni, Akshay Dudhane, Hiyam Debary, Mustansar Fiaz, Muhammad Akhtar Munir, Muhammad Sohail Danish, Paolo Fraccaro, Campbell D Watson, Levente J Klein, Fahad Shahbaz Khan, et al · 2025

5 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

baseline 2

citation-polarity summary

baseline 2

representative citing papers

SenseBench: A Benchmark for Remote Sensing Low-Level Visual Perception and Description in Large Vision-Language Models

cs.CV · 2026-05-11 · unverdicted · novelty 8.0

SenseBench is the first physics-based benchmark with 10K+ instances and dual protocols to evaluate VLMs on remote sensing low-level perception and diagnostic description, revealing domain bias and specific failure modes.

GeoX: Mastering Geospatial Reasoning Through Self-Play and Verifiable Rewards

cs.AI · 2026-05-19 · unverdicted · novelty 7.0

GeoX is a self-play RL framework in which a single multimodal policy proposes and solves spatial problems as executable programs over image primitives, using verifiable rewards to improve base VLMs by up to 5.5 points without large curated data.

GeoVista: Visually Grounded Active Perception for Ultra-High-Resolution Remote Sensing Understanding

cs.CV · 2026-05-14 · unverdicted · novelty 7.0

GeoVista introduces a planning-driven active perception framework with global exploration plans, branch-wise local inspection, and explicit evidence tracking to achieve state-of-the-art results on ultra-high-resolution remote sensing benchmarks.

SkyNative: A Native Multimodal Framework for Remote Sensing Visual Evidence Reasoning

cs.CV · 2026-05-18 · unverdicted · novelty 6.0

SkyNative introduces an encoder-free architecture using raw patch tokens and modality-specific parameters in a unified autoregressive model to improve image-grounded reasoning in remote sensing vision-language tasks.

No One Knows the State of the Art in Geospatial Foundation Models

cs.CV · 2026-05-12 · accept · novelty 6.0

An audit of 152 papers reveals that geospatial foundation models lack standardized evaluations, training controls, and weight releases, so no one knows the state of the art.

citing papers explorer

SenseBench: A Benchmark for Remote Sensing Low-Level Visual Perception and Description in Large Vision-Language Models
cs.CV · 2026-05-11 · unverdicted · none · ref 37

GeoX: Mastering Geospatial Reasoning Through Self-Play and Verifiable Rewards
cs.AI · 2026-05-19 · unverdicted · none · ref 22

GeoVista: Visually Grounded Active Perception for Ultra-High-Resolution Remote Sensing Understanding
cs.CV · 2026-05-14 · unverdicted · none · ref 40

SkyNative: A Native Multimodal Framework for Remote Sensing Visual Evidence Reasoning
cs.CV · 2026-05-18 · unverdicted · none · ref 34

No One Knows the State of the Art in Geospatial Foundation Models
cs.CV · 2026-05-12 · accept · none · ref 63
