MapReason-OSM supplies 6000 graph-verifiable instances across 12 mobility tasks on rendered OSM maps from 10 U.S. downtowns and shows that seven VLMs succeed at simple routing but perform near chance on cost-based facility placement and cross-zoom consistency.
3 Pith papers cite this work.
years: 2026 (3)
representative citing papers
GeoNatureAgent Benchmark tests seven LLMs on 93 tasks via a production geospatial API. Claude Sonnet 4 scores 60.8%, DeepSeek V3.2 comes close at 11x lower cost, and all models fail on close-value comparisons.
SLM adds a dedicated spatial modality and training dataset to LLMs, enabling geometric spatial reasoning and outperforming prompt-based symbolic methods on the new SpatialEval benchmark.
citing papers explorer
- MapReason-OSM: Can Vision-Language Models Make Graph-Verifiable Mobility Decisions from Street Maps?