HA-VLN 2.0: An Open Benchmark and Leaderboard for Human-Aware Navigation in Discrete and Continuous Environments with Dynamic Multi-Human Interactions

Alexander G Hauptmann; Fengyi Wu; Heng Li; Jingdong Sun; Lingdong Kong; Minghan Li; Qi Dai; Qi He; Yifei Dong; Yuxuan Zhou

arxiv: 2503.14229 · v4 · pith:X6U2KVDWnew · submitted 2025-03-18 · 💻 cs.AI · cs.CV · cs.RO

Yifei Dong, Fengyi Wu, Qi He, Lingdong Kong, Heng Li, Minghan Li, Zebang Cheng, Yuxuan Zhou, Jingdong Sun, Qi Dai, Alexander G Hauptmann, Zhi-Qi Cheng

classification 💻 cs.AI · cs.CV · cs.RO

keywords navigation · ha-vln · benchmark · continuous · discrete · dynamic · environments · explicit

Vision-and-Language Navigation (VLN) has been studied mainly in either discrete or continuous spaces, with little attention to dynamic, crowded environments. We present HA-VLN 2.0, a unified benchmark introducing explicit social-awareness constraints. Our contributions are: (i) a standardized task and metrics capturing both goal accuracy and personal-space adherence; (ii) the HAPS 2.0 dataset and simulators modeling multi-human interactions, outdoor contexts, and finer language-motion alignment; (iii) benchmarks on 16,844 socially grounded instructions, revealing sharp performance drops of leading agents under human dynamics and partial observability; and (iv) real-world robot experiments validating sim-to-real transfer, with an open leaderboard enabling transparent comparison. Results show that explicit social modeling improves navigation robustness and reduces collisions, underscoring the necessity of human-centric approaches. By releasing datasets, simulators, baselines, and protocols, HA-VLN 2.0 provides a strong foundation for safe, human-aware navigation research.

Forward citations

Cited by 5 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

HCSG: Human-Centric Semantic-Geometric Reasoning for Vision-Language Navigation
cs.RO 2026-05 unverdicted novelty 7.0

HCSG combines geometric forecasting of human pose and trajectory with VLM-generated semantic descriptions of intentions, fused into a topological map with a social distance loss, yielding 14% higher success rate and 3...
Beyond Isolation: A Unified Benchmark for General-Purpose Navigation
cs.RO 2026-05 unverdicted novelty 7.0

OmniNavBench is a unified benchmark for general-purpose navigation featuring composite multi-skill instructions, support for humanoid, quadrupedal and wheeled robots, and 1779 human teleoperated trajectories across 17...
Rule-VLN: Bridging Perception and Compliance via Semantic Reasoning and Geometric Rectification
cs.AI 2026-04 unverdicted novelty 7.0

Rule-VLN is the first large-scale benchmark injecting 177 regulatory categories into an urban environment, and the proposed SNRM module equips pre-trained VLN agents with zero-shot semantic reasoning and detour planni...
GoViG: Goal-Conditioned Visual Navigation Instruction Generation via Multimodal Reasoning
cs.CV 2025-08 unverdicted novelty 6.0

GoViG decomposes goal-conditioned navigation instruction generation into visual state prediction and instruction synthesis using an autoregressive multimodal LLM with one-pass and interleaved reasoning, showing gains ...
Vision-and-Language Navigation for UAVs: Progress, Challenges, and a Research Roadmap
cs.RO 2026-04 unverdicted novelty 4.0

A survey of UAV vision-and-language navigation that establishes a methodological taxonomy, reviews resources and challenges, and proposes a forward-looking research roadmap.