Ground slow, move fast: A dual-system foundation model for gener- alizable vision-and-language navigation

· 2025 · arXiv 2512.08186

7 Pith papers cite this work. Polarity classification is still indexing.

7 Pith papers citing it

read on arXiv browse 7 citing papers

citation-role summary

background 1 baseline 1 method 1

citation-polarity summary

background 1 baseline 1 use method 1

representative citing papers

Beyond Isolation: A Unified Benchmark for General-Purpose Navigation

cs.RO · 2026-05-10 · unverdicted · novelty 7.0

OmniNavBench is a unified benchmark for general-purpose navigation featuring composite multi-skill instructions, support for humanoid, quadrupedal and wheeled robots, and 1779 human teleoperated trajectories across 170 environments.

VLN-Cache: Enabling Token Caching for VLN Models with Visual/Semantic Dynamics Awareness

cs.RO · 2026-03-07 · conditional · novelty 7.0

VLN-Cache delivers up to 1.52x faster inference in VLN models by using view-aligned remapping for geometric consistency and a task-relevance saliency filter to manage semantic changes during navigation.

SpaAct: Spatially-Activated Transition Learning with Curriculum Adaptation for Vision-Language Navigation

cs.CV · 2026-04-30 · unverdicted · novelty 6.0

SpaAct activates spatial awareness in VLMs using action retrospection, future frame prediction, and progressive curriculum learning to reach SOTA on VLN-CE benchmarks.

ReMemNav: A Rethinking and Memory-Augmented Framework for Zero-Shot Object Navigation

cs.RO · 2026-03-25 · conditional · novelty 6.0

ReMemNav improves zero-shot object navigation success and efficiency by integrating episodic memory and rethinking with VLMs, achieving SR/SPL gains of 1.7%/7.0% on HM3D v0.1, 18.2%/11.1% on HM3D v0.2, and 8.7%/7.9% on MP3D.

Beyond Waypoints: Dual-Heatmap Grounding for Cross-Embodiment Semantic Navigation

cs.RO · 2026-05-19 · unverdicted · novelty 5.0

A vision-language model outputs dual heatmaps for navigation affordance and facing to ground semantic instructions into executable free space, achieving higher affordance rates than waypoint regression across simulated robot embodiments.

What Limits Vision-and-Language Navigation ?

cs.RO · 2026-05-13 · unverdicted · novelty 5.0

StereoNav reaches new benchmark highs on R2R-CE and RxR-CE and improves real-robot reliability by supplying persistent target-location priors and stereo-derived geometry that stay stable under lighting changes and blur.

Dual-Anchoring: Addressing State Drift in Vision-Language Navigation

cs.CV · 2026-04-19

citing papers explorer

Showing 1 of 1 citing paper after filters.

SpaAct: Spatially-Activated Transition Learning with Curriculum Adaptation for Vision-Language Navigation cs.CV · 2026-04-30 · unverdicted · none · ref 69
SpaAct activates spatial awareness in VLMs using action retrospection, future frame prediction, and progressive curriculum learning to reach SOTA on VLN-CE benchmarks.

Ground slow, move fast: A dual-system foundation model for gener- alizable vision-and-language navigation

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer