SpatialNav: Leveraging spatial scene graphs for zero-shot vision-and-language navigation
4 Pith papers cite this work. Polarity classification is still indexing.
years: 2026 (4)
verdicts: UNVERDICTED (4)
representative citing papers
SEDualVLN proposes a spatially-enhanced dual-system VLN framework that pairs a fast VLM action generator with a slow MLLM waypoint planner and reports state-of-the-art results on VLN-CE benchmarks.
SpaAct activates spatial awareness in VLMs using action retrospection, future frame prediction, and progressive curriculum learning to reach SOTA on VLN-CE benchmarks.
ABot-Explorer unifies online exploration and hierarchical semantic memory construction via VLM-distilled navigational affordances for improved embodied navigation efficiency.
citing papers explorer
- QuadAgent: A Responsive Agent System for Vision-Language Guided Quadrotor Agile Flight
  QuadAgent uses an asynchronous multi-agent architecture with an Impression Graph for scene memory and vision-based avoidance to enable training-free vision-language guided agile quadrotor flight, outperforming baselines in simulations and achieving real-world speeds up to 5 m/s.
- SEDualVLN: A Spatially-Enhanced Dual-System for Vision-Language Navigation
  SEDualVLN proposes a spatially-enhanced dual-system VLN framework that pairs a fast VLM action generator with a slow MLLM waypoint planner and reports state-of-the-art results on VLN-CE benchmarks.
- SpaAct: Spatially-Activated Transition Learning with Curriculum Adaptation for Vision-Language Navigation
  SpaAct activates spatial awareness in VLMs using action retrospection, future frame prediction, and progressive curriculum learning to reach SOTA on VLN-CE benchmarks.
- Explore Like Humans: Autonomous Exploration with Online SG-Memo Construction for Embodied Agents
  ABot-Explorer unifies online exploration and hierarchical semantic memory construction via VLM-distilled navigational affordances for improved embodied navigation efficiency.