GeoX is a self-play RL framework in which a single multimodal policy proposes and solves spatial problems as executable programs over image primitives, using verifiable rewards to improve base VLMs by up to 5.5 points without large curated data.
2 Pith papers cite this work. Polarity classification is still indexing.
years: 2026 (2)
verdicts: UNVERDICTED (2)
representative citing papers
SkyNative introduces an encoder-free architecture using raw patch tokens and modality-specific parameters in a unified autoregressive model to improve image-grounded reasoning in remote sensing vision-language tasks.
citing papers explorer
- GeoX: Mastering Geospatial Reasoning Through Self-Play and Verifiable Rewards
- SkyNative: A Native Multimodal Framework for Remote Sensing Visual Evidence Reasoning