CollabOD: Collaborative Multi-Backbone with Cross-scale Vision for UAV Small Object Detection

Chuanzhi Xu; Jun Guo; Kang Han; Pengfei Ye; Xuecheng Bai; Yuxiang Wang

Small object detection in unmanned aerial vehicle (UAV) imagery is challenging because high-altitude viewpoints produce severe scale variation, weak structural cues, and tight computational budgets. Existing lightweight detectors usually fuse multi-scale features after downsampling, where boundary and texture details have already been attenuated and heterogeneous feature streams may be spatially misaligned. To address these issues, we propose CollabOD, a collaborative detection framework that preserves structural details, aligns cross-path features before fusion, and keeps the detection head lightweight at inference time. CollabOD combines a Dual-Path Fusion Stem, a Dense Aggregation Block, a Bilateral Reweighting Module, and a Unified Detail-Aware Head to strengthen localization-oriented representation while limiting extra computation. On VisDrone, CollabOD obtains 52.4 AP50, 30.8 AP75, and 29.9 AP50:95 with 65.5 GFLOPs; on UAVDT it reaches 31.2 AP50 and 17.4 AP50:95; and on AI-TOD it reaches 45.4 AP50 and 20.0 AP50:95 at 137 FPS. The code is available at: https://github.com/Bai-Xuecheng/CollabOD.
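The reported AP50, AP75, and AP50:95 figures depend on intersection-over-union (IoU) thresholds, and small objects are especially sensitive to them. The following is a minimal illustrative sketch, not taken from the CollabOD code base. It assumes the COCO-style convention in which AP50:95 averages AP over IoU thresholds from 0.50 to 0.95 in steps of 0.05. The helper names `iou`, `shifted_iou`, and `thresholds_passed` are hypothetical. The sketch shows how the same 2-pixel localization error affects a 10x10 box far more than a 100x100 box.

```python
# Illustrative only: helper names are hypothetical, not from the CollabOD repository.

def iou(box_a, box_b):
    """Intersection over union for axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap region, clamped at zero when boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def shifted_iou(size, shift):
    """IoU between a square ground-truth box and the same box shifted diagonally."""
    gt = (0, 0, size, size)
    pred = (shift, shift, size + shift, size + shift)
    return iou(gt, pred)


# Assumed COCO-style thresholds: 0.50, 0.55, ..., 0.95.
IOU_THRESHOLDS = [0.50 + 0.05 * i for i in range(10)]


def thresholds_passed(iou_value):
    """Number of IoU thresholds at which a prediction counts as a true positive."""
    return sum(iou_value >= t for t in IOU_THRESHOLDS)


if __name__ == "__main__":
    for size in (10, 100):
        value = shifted_iou(size, shift=2)
        print(f"{size}x{size} box, 2 px shift: IoU={value:.3f}, "
              f"thresholds passed={thresholds_passed(value)}/10")
```

Under these assumptions, the shifted 10x10 box falls below the 0.50 threshold and would not count as a match even for AP50, while the shifted 100x100 box still matches at most thresholds. This is one reason preserving boundary detail matters for small-object localization.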