OmniNavBench is a unified benchmark for general-purpose navigation featuring composite multi-skill instructions, support for humanoid, quadrupedal, and wheeled robots, and 1,779 human-teleoperated trajectories across 170 environments.
Ground slow, move fast: A dual-system foundation model for generalizable vision-and-language navigation
7 Pith papers cite this work.
years
2026: 7
representative citing papers
VLN-Cache delivers up to 1.52x faster inference in VLN models by using view-aligned remapping for geometric consistency and a task-relevance saliency filter to manage semantic changes during navigation.
SpaAct activates spatial awareness in VLMs using action retrospection, future frame prediction, and progressive curriculum learning to reach SOTA on VLN-CE benchmarks.
ReMemNav improves zero-shot object navigation success and efficiency by integrating episodic memory and rethinking with VLMs, achieving SR/SPL gains of 1.7%/7.0% on HM3D v0.1, 18.2%/11.1% on HM3D v0.2, and 8.7%/7.9% on MP3D.
A vision-language model outputs dual heatmaps for navigation affordance and facing to ground semantic instructions into executable free space, achieving higher affordance rates than waypoint regression across simulated robot embodiments.
StereoNav reaches new benchmark highs on R2R-CE and RxR-CE and improves real-robot reliability by supplying persistent target-location priors and stereo-derived geometry that stay stable under lighting changes and blur.
citing papers explorer
SpaAct: Spatially-Activated Transition Learning with Curriculum Adaptation for Vision-Language Navigation