A new diagnostic benchmark decomposes LLM spatial navigation into three cognitive scales and shows that cross-scale aggregation, not single-level deficits, causes failure beyond small mazes.
2 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Representative citing papers:
Introduces a trustworthiness-and-complexity switching metric that lets LLMs choose between language and grid modalities for spatial reasoning, yielding up to 42% gains in tested settings.
citing papers explorer
- Lost in Aggregation: A Multi-Scale Diagnostic Benchmark for LLM Spatial Navigation
  A new diagnostic benchmark decomposes LLM spatial navigation into three cognitive scales and shows that cross-scale aggregation, not single-level deficits, causes failure beyond small mazes.
- Spatial Reasoning via Modality Switching Between Language and Symbolic Representation
  Introduces a trustworthiness-and-complexity switching metric that lets LLMs choose between language and grid modalities for spatial reasoning, yielding up to 42% gains in tested settings.