WMNav: Integrating vision-language models into world models for object goal navigation

· 2025

3 Pith papers cite this work. Polarity classification is still being indexed.

representative citing papers

WorldMAP: Bootstrapping Vision-Language Navigation Trajectory Prediction with Generative World Models

cs.AI · 2026-04-09 · unverdicted · novelty 7.0

WorldMAP bootstraps reliable trajectory prediction in vision-language navigation by converting world-model-generated futures into structured supervision, cutting ADE by 18% and FDE by 42.1% on Target-Bench while making small VLMs competitive with large ones.

ReMemNav: A Rethinking and Memory-Augmented Framework for Zero-Shot Object Navigation

cs.RO · 2026-03-25 · conditional · novelty 6.0

ReMemNav improves zero-shot object navigation success and efficiency by integrating episodic memory and rethinking with VLMs, achieving SR/SPL gains of 1.7%/7.0% on HM3D v0.1, 18.2%/11.1% on HM3D v0.2, and 8.7%/7.9% on MP3D.

A Deployable Embodied Vision-Language Navigation System with Hierarchical Cognition and Context-Aware Exploration

cs.RO · 2026-04-23 · unverdicted · novelty 4.0 · 2 refs

Introduces a hierarchical VLN architecture with asynchronous layers, incremental memory graph, and WTRP-based exploration that improves success and efficiency on resource-constrained robots.

citing papers explorer

WorldMAP: Bootstrapping Vision-Language Navigation Trajectory Prediction with Generative World Models

cs.AI · 2026-04-09 · unverdicted · none · ref 16

ReMemNav: A Rethinking and Memory-Augmented Framework for Zero-Shot Object Navigation

cs.RO · 2026-03-25 · conditional · none · ref 31

A Deployable Embodied Vision-Language Navigation System with Hierarchical Cognition and Context-Aware Exploration

cs.RO · 2026-04-23 · unverdicted · none · ref 29 · 2 links
