Vision-and-Language Navigation: Interpreting Visually-Grounded Navigation Instructions in Real Environments

Peter Anderson, Qi Wu, Damien Teney, Jake Bruce, Mark Johnson, Niko Sünderhauf, Ian Reid, Stephen Gould, Anton van den Hengel · 2018 · arXiv 2018.00387

2 Pith papers cite this work. Polarity classification is still indexing.

2 Pith papers citing it

read on arXiv browse 2 citing papers

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

Lost in Aggregation: A Multi-Scale Diagnostic Benchmark for LLM Spatial Navigation

physics.soc-ph · 2026-06-20 · unverdicted · novelty 7.0

A new diagnostic benchmark decomposes LLM spatial navigation into three cognitive scales and shows that cross-scale aggregation, not single-level deficits, causes failure beyond small mazes.

A Survey on Vision-Language-Action Models: An Action Tokenization Perspective

cs.RO · 2025-07-02 · unverdicted · novelty 5.0

The survey frames VLA models as pipelines that generate progressively grounded action tokens and classifies those tokens into eight types to guide future development.

citing papers explorer

Showing 1 of 1 citing paper after filters.

A Survey on Vision-Language-Action Models: An Action Tokenization Perspective cs.RO · 2025-07-02 · unverdicted · none · ref 240
The survey frames VLA models as pipelines that generate progressively grounded action tokens and classifies those tokens into eight types to guide future development.

Vision-and-Language Navigation: Interpreting Visually-Grounded Navigation Instructions in Real Environments

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer