arxiv: 2503.14229 · v4 · pith:X6U2KVDWnew · submitted 2025-03-18 · 💻 cs.AI · cs.CV · cs.RO

HA-VLN 2.0: An Open Benchmark and Leaderboard for Human-Aware Navigation in Discrete and Continuous Environments with Dynamic Multi-Human Interactions

classification 💻 cs.AI cs.CV cs.RO
keywords navigation, ha-vln, benchmark, continuous, discrete, dynamic, environments, explicit

Vision-and-Language Navigation (VLN) has been studied mainly in either discrete or continuous spaces, with little attention to dynamic, crowded environments. We present HA-VLN 2.0, a unified benchmark introducing explicit social-awareness constraints. Our contributions are: (i) a standardized task and metrics capturing both goal accuracy and personal-space adherence; (ii) the HAPS 2.0 dataset and simulators modeling multi-human interactions, outdoor contexts, and finer language-motion alignment; (iii) benchmarks on 16,844 socially grounded instructions, revealing sharp performance drops of leading agents under human dynamics and partial observability; and (iv) real-world robot experiments validating sim-to-real transfer, with an open leaderboard enabling transparent comparison. Results show that explicit social modeling improves navigation robustness and reduces collisions, underscoring the necessity of human-centric approaches. By releasing datasets, simulators, baselines, and protocols, HA-VLN 2.0 provides a strong foundation for safe, human-aware navigation research.

Forward citations

Cited by 5 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. HCSG: Human-Centric Semantic-Geometric Reasoning for Vision-Language Navigation

    cs.RO 2026-05 unverdicted novelty 7.0

    HCSG combines geometric forecasting of human pose and trajectory with VLM-generated semantic descriptions of intentions, fused into a topological map with a social distance loss, yielding 14% higher success rate and 3...

  2. Beyond Isolation: A Unified Benchmark for General-Purpose Navigation

    cs.RO 2026-05 unverdicted novelty 7.0

    OmniNavBench is a unified benchmark for general-purpose navigation featuring composite multi-skill instructions, support for humanoid, quadrupedal and wheeled robots, and 1779 human teleoperated trajectories across 17...

  3. Rule-VLN: Bridging Perception and Compliance via Semantic Reasoning and Geometric Rectification

    cs.AI 2026-04 unverdicted novelty 7.0

    Rule-VLN is the first large-scale benchmark injecting 177 regulatory categories into an urban environment, and the proposed SNRM module equips pre-trained VLN agents with zero-shot semantic reasoning and detour planni...

  4. GoViG: Goal-Conditioned Visual Navigation Instruction Generation via Multimodal Reasoning

    cs.CV 2025-08 unverdicted novelty 6.0

    GoViG decomposes goal-conditioned navigation instruction generation into visual state prediction and instruction synthesis using an autoregressive multimodal LLM with one-pass and interleaved reasoning, showing gains ...

  5. Vision-and-Language Navigation for UAVs: Progress, Challenges, and a Research Roadmap

    cs.RO 2026-04 unverdicted novelty 4.0

    A survey of UAV vision-and-language navigation that establishes a methodological taxonomy, reviews resources and challenges, and proposes a forward-looking research roadmap.