VidTAG achieves fine-grained global video-to-GPS geolocalization via temporal frame alignment and denoising sequence refinement, reporting 20% gains at 1 km over GeoCLIP and 25% on CityGuessr68k.
Georanker: Distance-aware ranking for worldwide image geolocalization
2 Pith papers cite this work. Polarity classification is still indexing.
2
Pith papers citing it
fields
cs.CV 2years
2026 2verdicts
UNVERDICTED 2representative citing papers
GeoSkill lets vision-language models improve geolocation accuracy and reasoning by maintaining an evolving Skill-Graph that grows through autonomous analysis of successful and failed rollouts on web-scale image data.
citing papers explorer
-
VidTAG: Temporally Aligned Video to GPS Geolocalization with Denoising Sequence Prediction at a Global Scale
VidTAG achieves fine-grained global video-to-GPS geolocalization via temporal frame alignment and denoising sequence refinement, reporting 20% gains at 1 km over GeoCLIP and 25% on CityGuessr68k.
-
Skill-Conditioned Visual Geolocation for Vision-Language Models
GeoSkill lets vision-language models improve geolocation accuracy and reasoning by maintaining an evolving Skill-Graph that grows through autonomous analysis of successful and failed rollouts on web-scale image data.