UAVDB: Point-Guided Masks for UAV Detection and Segmentation
Pith reviewed 2026-05-23 20:30 UTC · model grok-4.3
The pith
A UAV dataset is constructed from video trajectory points via intensity convergence to produce boxes and masks without manual labeling.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The paper claims that Patch Intensity Convergence (PIC) combined with SAM2 yields higher IoU than existing annotation techniques, and that this pipeline makes it possible to build UAVDB, a dataset of multi-scale UAV instances accompanied by YOLO detector baselines.
What carries the argument
Patch Intensity Convergence (PIC), a lightweight method that converts trajectory points into bounding boxes by analyzing patch intensities.
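The text here does not spell out PIC's update rule, so the following is only a minimal sketch under an assumed rule: grow a square patch around the trajectory point and stop once the patch's mean intensity changes by less than a tolerance between successive sizes. The function name `pic_box`, the parameters `tol` and `max_half`, and the square-patch choice are all hypothetical and not taken from the paper.

```python
import numpy as np

def pic_box(gray, cx, cy, tol=0.5, max_half=50):
    """Hypothetical point-to-box rule (an assumption, not the paper's exact method).

    Grows a square patch centred on the trajectory point (cx, cy) and returns
    the patch as (x0, y0, x1, y1), inclusive, once its mean intensity changes
    by less than `tol` between two successive patch sizes.
    """
    h, w = gray.shape
    prev_mean = float(gray[cy, cx])  # start from the seed pixel itself
    for half in range(1, max_half + 1):
        # Clip the patch to the image borders.
        x0, x1 = max(cx - half, 0), min(cx + half, w - 1)
        y0, y1 = max(cy - half, 0), min(cy + half, h - 1)
        mean = float(gray[y0:y1 + 1, x0:x1 + 1].mean())
        if abs(mean - prev_mean) < tol:
            return x0, y0, x1, y1  # mean intensity has converged
        prev_mean = mean
    return x0, y0, x1, y1  # fall back to the largest patch tried

# Toy frame: bright sky with a single dark pixel standing in for a distant UAV.
frame = np.full((41, 41), 200.0)
frame[20, 20] = 0.0
box = pic_box(frame, 20, 20)
```

Under this assumed rule the tolerance directly controls how tight the box is around a near-single-pixel target, which is one reason a formal statement of the actual PIC parameters would be useful.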
If this is right
- Large UAV datasets can be built at lower labeling cost while retaining precise spatial localization.
- Segmentation masks become available alongside detection labels for multi-task model training.
- YOLO-based detectors gain concrete baselines on UAVs ranging from clear objects to single-pixel instances.
- Future annotation pipelines can start from trajectory points rather than full manual box drawing.
Where Pith is reading between the lines
- The same point-to-box conversion could annotate other small moving targets in fixed-camera video if trajectory data is present.
- Swapping SAM2 for alternative mask generators might improve results under specific lighting or motion conditions.
- UAVDB instances near single-pixel size offer a direct test for how current detectors handle the smallest detectable objects.
Load-bearing premise
Trajectory points from the source multi-view video are accurate and dense enough to generate reliable bounding boxes without additional manual correction.
What would settle it
Manually creating ground-truth boxes on a held-out subset of frames and finding that PIC-generated boxes have substantially lower IoU than the reported values would falsify the performance claim.
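The check described above comes down to a box IoU between PIC output and manual boxes. A minimal sketch, assuming axis-aligned boxes given as `(x0, y0, x1, y1)` in pixels with exclusive upper edges (a convention chosen here for illustration, not one stated in the paper):

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes (x0, y0, x1, y1), upper edges exclusive."""
    # Intersection rectangle; width/height clamp to zero when boxes do not overlap.
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(ix1 - ix0, 0) * max(iy1 - iy0, 0)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

# Same one-pixel horizontal shift applied to a large and a tiny box.
iou_large = box_iou((0, 0, 20, 20), (1, 0, 21, 20))  # 380 / 420
iou_tiny = box_iou((0, 0, 2, 2), (1, 0, 3, 2))       # 2 / 6
```

A one-pixel shift costs the 20×20 box less than a tenth of its IoU but drops the 2×2 box to one third, so even small systematic offsets in the seeds would dominate IoU for the smallest instances.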
Original abstract
Accurate detection of Unmanned Aerial Vehicles (UAVs) is critical for surveillance, security, and airspace monitoring. However, existing datasets remain limited in scale, resolution, and the ability to capture objects across extreme size variations. To address these challenges, we present UAVDB, a benchmark dataset for UAV detection and segmentation, constructed via a point-guided weak supervision pipeline. We introduce Patch Intensity Convergence (PIC), a lightweight annotation method that converts trajectory points into bounding boxes, eliminating the need for manual labeling while preserving precise spatial localization. Building upon these annotations, we further generate segmentation masks using SAM2, enriching the dataset with multi-task labels. UAVDB consists of RGB frames from a fixed-camera multi-view video dataset, capturing UAVs across scales ranging from clearly visible objects to near single-pixel instances under diverse conditions. Quantitative results show that PIC combined with SAM2 outperforms existing annotation techniques in terms of IoU. Furthermore, we benchmark YOLO-based detectors on UAVDB, establishing baselines for future research.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper presents UAVDB, a benchmark dataset for UAV detection and segmentation constructed from fixed-camera multi-view video via a point-guided weak supervision pipeline. It introduces Patch Intensity Convergence (PIC) to convert trajectory points into bounding boxes without manual labeling, then applies SAM2 to generate segmentation masks. The central claims are that PIC+SAM2 outperforms prior annotation techniques on IoU and that YOLO-based detectors provide useful baselines on this dataset spanning extreme scale variations including near-single-pixel UAVs.
Significance. If the IoU gains are reproducible and the trajectory seeds prove reliable, the work supplies a practical, scalable annotation route for small-object UAV data that existing manual or fully supervised pipelines struggle to produce at volume. The resulting multi-task labels and YOLO baselines could serve as a reference point for future surveillance and airspace-monitoring research.
major comments (1)
- [§4] §4 (Experiments) and the quantitative IoU claim: the headline result that PIC+SAM2 outperforms existing annotation techniques rests on the unvalidated premise that the source trajectory points are sufficiently accurate and dense to serve as faithful seeds for PIC box generation. No manual ground-truth comparison, density statistics, or subset validation against independent location annotations is described; for near-single-pixel UAVs any systematic offset would render the reported IoU advantage non-diagnostic.
minor comments (2)
- [Abstract] The abstract states an IoU improvement but supplies no numerical values, error bars, or dataset-split details; these should be added for immediate readability.
- [Methods] Notation for PIC parameters and the exact conversion from trajectory points to boxes should be formalized with an equation or pseudocode in the methods section.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback. The concern about validating the source trajectory points is well-taken, and we address it directly below with a commitment to strengthen the manuscript.
Referee: [§4] §4 (Experiments) and the quantitative IoU claim: the headline result that PIC+SAM2 outperforms existing annotation techniques rests on the unvalidated premise that the source trajectory points are sufficiently accurate and dense to serve as faithful seeds for PIC box generation. No manual ground-truth comparison, density statistics, or subset validation against independent location annotations is described; for near-single-pixel UAVs any systematic offset would render the reported IoU advantage non-diagnostic.
Authors: The trajectory points originate from the source fixed-camera multi-view video dataset, which supplies them as part of its original construction. We agree that the current manuscript does not include explicit validation (density statistics or manual ground-truth comparison), which is a limitation for claims involving near-single-pixel objects. In the revised version we will add to §4: (i) point-density statistics stratified by UAV scale, and (ii) a validation subset in which independent manual location annotations are compared against the source points and the resulting PIC boxes. This will directly test whether systematic offsets exist and whether the reported IoU advantage remains diagnostic.
Revision: yes
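The validation proposed in this response can be sketched as follows. This is an illustration only: the record layout, the `scale_bin` thresholds, and the function names are hypothetical, and the authors' actual protocol is not described in the text. The idea is to compare source trajectory points against independent manual points and report the mean offset per UAV-scale bin, since a nonzero mean would indicate a systematic offset rather than random jitter.

```python
from collections import defaultdict
from statistics import mean

def scale_bin(box_side_px):
    """Hypothetical scale bins; the thresholds are placeholders, not from the paper."""
    if box_side_px <= 4:
        return "tiny"
    if box_side_px <= 16:
        return "small"
    return "large"

def offset_stats(records):
    """Mean (source - manual) point offset per scale bin.

    records: iterable of (box_side_px, (source_x, source_y), (manual_x, manual_y)).
    """
    by_bin = defaultdict(list)
    for side, (sx, sy), (mx, my) in records:
        by_bin[scale_bin(side)].append((sx - mx, sy - my))
    return {
        name: {
            "n": len(offsets),
            "mean_dx": mean(dx for dx, _ in offsets),
            "mean_dy": mean(dy for _, dy in offsets),
        }
        for name, offsets in by_bin.items()
    }

# Made-up example records, purely to show the call shape.
stats = offset_stats([
    (2, (11, 10), (10, 10)),
    (3, (21, 20), (20, 20)),
    (30, (100, 100), (100, 101)),
])
```

Reporting the per-bin count `n` alongside the mean offsets also gives a rough view of how many validated points fall in each scale range.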
Circularity Check
No circularity: empirical dataset construction without derivation chain
The paper introduces UAVDB via a point-guided pipeline (PIC for boxes from trajectories, SAM2 for masks) and reports empirical IoU benchmarks plus YOLO baselines. No equations, parameter fitting, predictions, or self-citation chains appear in the provided text. The central claims rest on the described annotation process and external comparisons rather than any reduction of outputs to inputs by construction. This matches the reader's 0.0 assessment; honest non-finding applies.