CollabOD: Collaborative Multi-Backbone with Cross-scale Vision for UAV Small Object Detection
Small object detection in unmanned aerial vehicle (UAV) imagery is challenging because high-altitude viewpoints produce severe scale variation, weak structural cues, and tight computational budgets. Existing lightweight detectors usually fuse multi-scale features after downsampling, where boundary and texture details have already been attenuated and heterogeneous feature streams may be spatially misaligned. To address these issues, we propose CollabOD, a collaborative detection framework that preserves structural details, aligns cross-path features before fusion, and keeps the detection head lightweight at inference time. CollabOD combines a Dual-Path Fusion Stem, a Dense Aggregation Block, a Bilateral Reweighting Module, and a Unified Detail-Aware Head to strengthen localization-oriented representation while limiting extra computation. On VisDrone, CollabOD obtains 52.4 AP50, 30.8 AP75, and 29.9 AP50:95 with 65.5 GFLOPs; on UAVDT it reaches 31.2 AP50 and 17.4 AP50:95; and on AI-TOD it reaches 45.4 AP50 and 20.0 AP50:95 at 137 FPS. The code is available at: https://github.com/Bai-Xuecheng/CollabOD.
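The reported AP50, AP75, and AP50:95 figures depend on how strictly a predicted box must overlap a ground-truth box. The following is a minimal sketch of that overlap test, assuming the metrics follow the usual COCO-style convention (IoU thresholds from 0.50 to 0.95 in steps of 0.05). The function names and the example boxes are illustrative and are not taken from the CollabOD code.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned (x1, y1, x2, y2) boxes."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Assumed COCO-style thresholds: 0.50, 0.55, ..., 0.95.
# AP50 uses only the first, AP75 uses 0.75, AP50:95 averages over all ten.
COCO_IOU_THRESHOLDS = [round(0.50 + 0.05 * i, 2) for i in range(10)]


def matched_thresholds(pred, gt):
    """Thresholds at which `pred` would count as a true positive for `gt`."""
    overlap = iou(pred, gt)
    return [t for t in COCO_IOU_THRESHOLDS if overlap >= t]


# Illustrative boxes: the same 2-pixel horizontal shift on a small and a large object.
small_gt, small_pred = (0, 0, 10, 10), (2, 0, 12, 10)
large_gt, large_pred = (0, 0, 100, 100), (2, 0, 102, 100)

print(iou(small_pred, small_gt), matched_thresholds(small_pred, small_gt))
print(iou(large_pred, large_gt), matched_thresholds(large_pred, large_gt))
```

In this sketch the 2-pixel shift leaves the 10x10 box with an IoU of about 0.67, so it passes only the thresholds up to 0.65, while the 100x100 box stays above 0.95 and passes all ten. This illustrates why localization errors weigh more heavily on AP75 and AP50:95 for small objects than on AP50.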