HM3D-OVON: A dataset and benchmark for open-vocabulary object goal navigation

2024 · arXiv 2409.14296

4 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

dataset 1

citation-polarity summary

use dataset 1

representative citing papers

LIME: Learning Intent-aware Camera Motion from Egocentric Video

cs.RO · 2026-07-02 · unverdicted · novelty 7.0

LIME formulates language-conditioned camera motion as predicting SE(3) target poses from RGB and intent text, using mined multi-intent supervision from egocentric video and a flow-matching pose head.

PInVerify: An Offline Embodied Benchmark for Active Instance Verification

cs.CV · 2026-05-28 · unverdicted · novelty 7.0

PInVerify is a new offline embodied benchmark for active instance verification that supplies multi-view captures and 6-sector navigation topology, with MLLM baselines reaching 85.6% after fine-tuning but showing no reliable benefit from tested next-best-view strategies.

Uni-NaVid: A Video-based Vision-Language-Action Model for Unifying Embodied Navigation Tasks

cs.RO · 2024-12-09 · unverdicted · novelty 6.0

Uni-NaVid unifies diverse embodied navigation tasks into one video-based vision-language-action model trained on 3.6 million samples from four sub-tasks, achieving state-of-the-art performance on benchmarks and real-world tests.

Qwen-RobotNav Technical Report: A Scalable Navigation Model Designed for an Agentic Navigation System

cs.RO · 2026-06-16 · unverdicted · novelty 5.0

Qwen-RobotNav provides a parameterized navigation model trained on 15.6M samples with vision-language co-training that achieves SOTA results on benchmarks and zero-shot transfer to real robots.

citing papers explorer

Showing 3 of 3 citing papers after filters.

LIME: Learning Intent-aware Camera Motion from Egocentric Video

cs.RO · 2026-07-02 · unverdicted · none · ref 47

PInVerify: An Offline Embodied Benchmark for Active Instance Verification

cs.CV · 2026-05-28 · unverdicted · none · ref 42

Qwen-RobotNav Technical Report: A Scalable Navigation Model Designed for an Agentic Navigation System

cs.RO · 2026-06-16 · unverdicted · none · ref 32
