Where on Earth? A Vision-Language Benchmark for Probing Model Geolocation Skills Across Scales

Zhaofang Qian, Hardy Chen, Zeyu Wang, Li Zhang, Zijun Wang, Xiaoke Huang, Hui Liu, Xianfeng Tang, Zeyu Zheng, Haoqin Tu, et al. · 2025 · arXiv 2510.10880

2 Pith papers cite this work. Polarity classification is still indexing.


representative citing papers

Skill-Conditioned Visual Geolocation for Vision-Language Models

cs.CV · 2026-04-10 · unverdicted · novelty 7.0 · 2 refs

GeoSkill lets vision-language models improve geolocation accuracy and reasoning by maintaining an evolving Skill-Graph that grows through autonomous analysis of successful and failed rollouts on web-scale image data.

Where Do Vision-Language Models Fail? World Scale Analysis for Image Geolocalization

cs.CV · 2026-04-17 · unverdicted · novelty 6.0

Vision-language models display large performance differences and clear limits in zero-shot country-level geolocalization from ground-view photos, with semantic cues helping coarse guesses but failing on fine details.

citing papers explorer

Showing 2 of 2 citing papers.

Skill-Conditioned Visual Geolocation for Vision-Language Models

cs.CV · 2026-04-10 · unverdicted · none · ref 22 · 2 links

Where Do Vision-Language Models Fail? World Scale Analysis for Image Geolocalization

cs.CV · 2026-04-17 · unverdicted · none · ref 29

