Three-Step Nav uses a three-view MLLM protocol to achieve state-of-the-art zero-shot VLN performance on R2R-CE and RxR-CE by global planning, local alignment, and trajectory auditing.
[Yes] (b) All the training details (e.g., data splits, hyperparameters, how they were chosen)
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.CV (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers:
Three-Step Nav: A Hierarchical Global-Local Planner for Zero-Shot Vision-and-Language Navigation