Nav-R1: Reasoning and Navigation in Embodied Scenes

Qingxiang Liu, Ting Huang, Zeyu Zhang, Hao Tang · 2025 · arXiv 2509.10884

7 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 1 · baseline 1

citation-polarity summary

background 1 · baseline 1

representative citing papers

AwareVLN: Reasoning with Self-awareness for Vision-Language Navigation

cs.RO · 2026-05-21 · unverdicted · novelty 7.0

AwareVLN introduces a structural reasoning module and automatic data engine with progress division to equip VLN agents with self-awareness of agent state and task progress, outperforming prior methods on Habitat datasets.

PlatonicNav: Unveiling Semantic Correspondence in Navigation with Platonic Topological Maps

cs.CV · 2026-06-01 · unverdicted · novelty 6.0

PlatonicNav is a training-free framework using Platonic Topological Maps from a self-supervised visual encoder to unify vision-only ObjNav, cross-modal ObjNav, and VLN via blind matching on a shared semantic manifold.

Goal2Pixel: Grounding Goals to Pixels for Vision-Language Navigation

cs.CV · 2026-06-01 · unverdicted · novelty 6.0

Goal2Pixel grounds VLN-CE goals to image pixels via VLM prediction plus keyframe memory, reaching 54.1% SR on R2R-CE Val-Unseen with 7.75 calls per episode versus 46.62 for action prediction.

SpaAct: Spatially-Activated Transition Learning with Curriculum Adaptation for Vision-Language Navigation

cs.CV · 2026-04-30 · unverdicted · novelty 6.0

SpaAct activates spatial awareness in VLMs using action retrospection, future frame prediction, and progressive curriculum learning to reach SOTA on VLN-CE benchmarks.

GeoWorld: Geometric World Models

cs.CV · 2026-02-26 · unverdicted · novelty 6.0

GeoWorld applies hyperbolic geometry to JEPA world models and introduces geometric reinforcement learning, reporting modest success-rate gains of ~3% and ~2% on 3- and 4-step planning tasks versus V-JEPA 2.

Progress-Think: Semantic Progress Reasoning for Vision-Language Navigation

cs.RO · 2025-11-21 · unverdicted · novelty 6.0

Semantic progress reasoning predicts instruction-style advancement from visual history to guide policies, yielding state-of-the-art success and efficiency on R2R-CE and RxR-CE.

UniMesh: Unifying 3D Mesh Understanding and Generation

cs.CV · 2026-04-19 · unverdicted · novelty 5.0

UniMesh unifies 3D mesh generation and understanding in one model via a Mesh Head interface, Chain of Mesh iterative editing, and an Actor-Evaluator self-reflection loop.

citing papers explorer

Showing 5 of 5 citing papers after filters.

PlatonicNav: Unveiling Semantic Correspondence in Navigation with Platonic Topological Maps

cs.CV · 2026-06-01 · unverdicted · none · ref 40

Goal2Pixel: Grounding Goals to Pixels for Vision-Language Navigation

cs.CV · 2026-06-01 · unverdicted · none · ref 17

SpaAct: Spatially-Activated Transition Learning with Curriculum Adaptation for Vision-Language Navigation

cs.CV · 2026-04-30 · unverdicted · none · ref 44

GeoWorld: Geometric World Models

cs.CV · 2026-02-26 · unverdicted · none · ref 44

UniMesh: Unifying 3D Mesh Understanding and Generation

cs.CV · 2026-04-19 · unverdicted · none · ref 25
