Reasoning-tuned LLMs reliably complete navigation in partial-observability gridworlds but take longer paths than oracle optima, with few-shot prompting reducing invalid moves and action priors like UP/RIGHT causing loops.
In: Advances in Neural Information Processing Systems (NeurIPS 2020), 1877–1901 (2020). https://papers.nips.cc/paper/2020/hash/1457c0d6bfcb4967418bfb8ac142f64a-Abstract.html
1 Pith paper cites this work. Polarity classification is still being indexed.
Fields: cs.AI (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
LLMs for Text-Based Exploration and Navigation Under Partial Observability