Aerial vision- and-language navigation via semantic-topo-metric representation guided llm reasoning

· 2024 · arXiv 2410.08500

7 Pith papers cite this work. Polarity classification is still indexing.

7 Pith papers citing it

read on arXiv browse 7 citing papers

citation-role summary

background 2 baseline 1 method 1

citation-polarity summary

background 3 baseline 1

representative citing papers

LookasideVLN: Direction-Aware Aerial Vision-and-Language Navigation

cs.CV · 2026-04-19 · unverdicted · novelty 7.0

LookasideVLN improves aerial vision-and-language navigation by encoding directional cues from instructions into an egocentric graph and lightweight knowledge base, outperforming prior methods like CityNavAgent even with single-step lookahead.

How Far Are Large Multimodal Models from Human-Level Spatial Action? A Benchmark for Goal-Oriented Embodied Navigation in Urban Airspace

cs.AI · 2026-04-09 · unverdicted · novelty 7.0

Large multimodal models display emerging but limited spatial action capabilities in goal-oriented urban 3D navigation, remaining far from human-level performance with errors diverging rapidly after critical decision points.

FineCog-Nav: Integrating Fine-grained Cognitive Modules for Zero-shot Multimodal UAV Navigation

cs.CV · 2026-04-17 · unverdicted · novelty 6.0

FineCog-Nav uses fine-grained cognitive modules driven by foundation models to outperform zero-shot baselines in UAV navigation and introduces the AerialVLN-Fine benchmark with refined instructions.

HTNav: A Hybrid Navigation Framework with Tiered Structure for Urban Aerial Vision-and-Language Navigation

cs.RO · 2026-04-10 · unverdicted · novelty 6.0

HTNav combines imitation and reinforcement learning in a staged, tiered structure with map learning to reach state-of-the-art performance on the CityNav benchmark for urban aerial navigation.

Aerial Vision-Language Navigation with a Unified Framework for Spatial, Temporal and Embodied Reasoning

cs.CV · 2025-12-09 · unverdicted · novelty 6.0

A monocular RGB-only aerial VLN framework outperforms baselines via prompt-guided multi-task learning, keyframe selection, and label reweighting on AerialVLN and OpenFly benchmarks.

Vision-and-Language Navigation for UAVs: Progress, Challenges, and a Research Roadmap

cs.RO · 2026-04-15 · unverdicted · novelty 4.0

A survey of UAV vision-and-language navigation that establishes a methodological taxonomy, reviews resources and challenges, and proposes a forward-looking research roadmap.

Vision-Language Navigation for Aerial Robots: Towards the Era of Large Language Models

cs.RO · 2026-04-09 · unverdicted · novelty 4.0

This survey organizes aerial vision-language navigation methods into five architectural categories, critically reviews evaluation infrastructure, and synthesizes seven open problems for LLM/VLM integration.

citing papers explorer

Showing 7 of 7 citing papers.

LookasideVLN: Direction-Aware Aerial Vision-and-Language Navigation cs.CV · 2026-04-19 · unverdicted · none · ref 14
LookasideVLN improves aerial vision-and-language navigation by encoding directional cues from instructions into an egocentric graph and lightweight knowledge base, outperforming prior methods like CityNavAgent even with single-step lookahead.
How Far Are Large Multimodal Models from Human-Level Spatial Action? A Benchmark for Goal-Oriented Embodied Navigation in Urban Airspace cs.AI · 2026-04-09 · unverdicted · none · ref 15
Large multimodal models display emerging but limited spatial action capabilities in goal-oriented urban 3D navigation, remaining far from human-level performance with errors diverging rapidly after critical decision points.
FineCog-Nav: Integrating Fine-grained Cognitive Modules for Zero-shot Multimodal UAV Navigation cs.CV · 2026-04-17 · unverdicted · none · ref 11
FineCog-Nav uses fine-grained cognitive modules driven by foundation models to outperform zero-shot baselines in UAV navigation and introduces the AerialVLN-Fine benchmark with refined instructions.
HTNav: A Hybrid Navigation Framework with Tiered Structure for Urban Aerial Vision-and-Language Navigation cs.RO · 2026-04-10 · unverdicted · none · ref 10
HTNav combines imitation and reinforcement learning in a staged, tiered structure with map learning to reach state-of-the-art performance on the CityNav benchmark for urban aerial navigation.
Aerial Vision-Language Navigation with a Unified Framework for Spatial, Temporal and Embodied Reasoning cs.CV · 2025-12-09 · unverdicted · none · ref 14
A monocular RGB-only aerial VLN framework outperforms baselines via prompt-guided multi-task learning, keyframe selection, and label reweighting on AerialVLN and OpenFly benchmarks.
Vision-and-Language Navigation for UAVs: Progress, Challenges, and a Research Roadmap cs.RO · 2026-04-15 · unverdicted · none · ref 294
A survey of UAV vision-and-language navigation that establishes a methodological taxonomy, reviews resources and challenges, and proposes a forward-looking research roadmap.
Vision-Language Navigation for Aerial Robots: Towards the Era of Large Language Models cs.RO · 2026-04-09 · unverdicted · none · ref 91
This survey organizes aerial vision-language navigation methods into five architectural categories, critically reviews evaluation infrastructure, and synthesizes seven open problems for LLM/VLM integration.

Aerial vision- and-language navigation via semantic-topo-metric representation guided llm reasoning

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer