MM-UAVBench: How well do multimodal large language models see, think, and plan in low-altitude UAV scenarios?

2025 · arXiv 2512.23219

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

AirGroundBench: Probing Spatial Intelligence in Multimodal Large Models under Heterogeneous Multi-View Embodied Collaboration

cs.CV · 2026-06-26 · unverdicted · novelty 7.0

AirGroundBench is a new diagnostic benchmark exposing that MLLMs handle basic spatial perception but struggle with cross-view alignment, transformation reasoning, and embodied navigation under heterogeneous air-ground views.

SpatialUAV: Benchmarking Spatial Intelligence for Low-Altitude UAV Perception, Collaboration, and Motion

cs.CV · 2026-06-26 · accept · novelty 7.0 · 2 refs

SpatialUAV releases a new multi-task benchmark for low-altitude UAV spatial intelligence and demonstrates that existing VLMs exhibit clear weaknesses in cross-view association and geometric reasoning.

citing papers explorer

Showing 1 of 1 citing paper after filters.

AirGroundBench: Probing Spatial Intelligence in Multimodal Large Models under Heterogeneous Multi-View Embodied Collaboration

cs.CV · 2026-06-26 · unverdicted · none · ref 12

AirGroundBench is a new diagnostic benchmark exposing that MLLMs handle basic spatial perception but struggle with cross-view alignment, transformation reasoning, and embodied navigation under heterogeneous air-ground views.