TagaVLM embeds topological structures into VLMs via residual attention and interleaved prompts, achieving 51.09% success rate on R2R unseen environments and outperforming prior large-model methods.
Navcot: Boosting llm-based vision-and-lang uage nav- igation via learning disentangled reasoning,
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
fields
cs.CV 1years
2026 1verdicts
CONDITIONAL 1representative citing papers
citing papers explorer
-
TagaVLM: Topology-Aware Global Action Reasoning for Vision-Language Navigation
TagaVLM embeds topological structures into VLMs via residual attention and interleaved prompts, achieving 51.09% success rate on R2R unseen environments and outperforming prior large-model methods.