GA-VLN builds a geometry-aware BEV representation from RGB-D inputs plus 3D foundation model features to deliver state-of-the-art vision-language navigation using only navigation data.
Weakly-supervised multi-granularity map learning for vision-and-language navigation. Advances in Neural Information Processing Systems, 35:38149–38161
1 Pith paper cites this work. Polarity classification is still indexing.
fields: cs.CV (1)
years: 2026 (1)
verdicts: UNVERDICTED (1)
representative citing papers:
GA-VLN: Geometry-Aware BEV Representation for Efficient Vision-Language Navigation