Aligning Cyber Space with Physical World: A Comprehensive Survey on Embodied AI
2 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Citing papers:
- WorldMAP: Bootstrapping Vision-Language Navigation Trajectory Prediction with Generative World Models
  WorldMAP bootstraps reliable trajectory prediction in vision-language navigation by converting world-model-generated futures into structured supervision, cutting ADE by 18% and FDE by 42.1% on Target-Bench while making small VLMs competitive with large ones.
- A Deployable Embodied Vision-Language Navigation System with Hierarchical Cognition and Context-Aware Exploration
  Introduces a hierarchical VLN architecture with asynchronous layers, incremental memory graph, and WTRP-based exploration that improves success and efficiency on resource-constrained robots.