Geo-R1 uses indirect proxy rewards from cross-view alignment with geolocation metadata to drive reinforcement learning, enabling zero-shot geospatial reasoning that transfers across 25+ tasks and sometimes exceeds supervised specialists.
Geochain: Multimodal chain-of-thought for geographic reasoning.arXiv preprint arXiv:2506.00785, 2025
2 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
years
2025 2verdicts
UNVERDICTED 2roles
background 1polarities
background 1representative citing papers
Thinking LLMs achieve ~10 percentage points higher accuracy than non-thinking ones on RewardBench with under 2x compute overhead, outperforming augmentation strategies that cost over 8x more while also showing better bias robustness.
citing papers explorer
-
Unlocking Zero-Shot Geospatial Reasoning via Indirect Rewards
Geo-R1 uses indirect proxy rewards from cross-view alignment with geolocation metadata to drive reinforcement learning, enabling zero-shot geospatial reasoning that transfers across 25+ tasks and sometimes exceeds supervised specialists.
-
Explicit Reasoning Makes Better Judges: A Systematic Study on Accuracy, Efficiency, and Robustness
Thinking LLMs achieve ~10 percentage points higher accuracy than non-thinking ones on RewardBench with under 2x compute overhead, outperforming augmentation strategies that cost over 8x more while also showing better bias robustness.