UAV-VisLoc: A large-scale dataset for UAV visual localization
5 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
fields
cs.CV 5
roles
background 2
polarities
background 2
representative citing papers
Cross3R performs feed-forward 3D reconstruction and 6-DoF pose estimation from any combination of satellite, UAV, and ground images, outperforming baselines on a new 278K-image tri-view dataset.
OrthoTrack is a training-free system for continuous metric 6-DoF UAV pose estimation anchored in public orthophotos and surface models, with a new MovingDrone benchmark dataset.
Introduces AnyVisLoc dataset and unified framework for UAV absolute visual localization, reports 74.1% accuracy within 5 m for best baseline, and proposes PDM@K retrieval metric.
citing papers explorer
- X-Stream: Exploring MLLMs as Multiplexers for Multi-Stream Understanding
X-Stream benchmark shows SOTA MLLMs score ~50% on concurrent multi-stream tasks and lack proactive ability, using a dual-verification pipeline to avoid single-stream bias.
- Seeing Across Skies and Streets: Feedforward 3D Reconstruction from Satellite, Drone, and Ground Images
Cross3R performs feed-forward 3D reconstruction and 6-DoF pose estimation from any combination of satellite, UAV, and ground images, outperforming baselines on a new 278K-image tri-view dataset.
- OrthoTrack: Continuous 6-DoF UAV Trajectory Estimation Anchored in Public Orthophotos
OrthoTrack is a training-free system for continuous metric 6-DoF UAV pose estimation anchored in public orthophotos and surface models, with a new MovingDrone benchmark dataset.