PP-YOLOE: An evolved version of YOLO
In this report, we present PP-YOLOE, an industrial state-of-the-art object detector that combines high performance with deployment-friendly design. We build on the previous PP-YOLOv2, adopting an anchor-free paradigm, a more powerful backbone and neck equipped with CSPRepResStage, the ET-head, and the dynamic label assignment algorithm TAL. We provide s/m/l/x models for different practical scenarios. As a result, PP-YOLOE-l achieves 51.4 mAP on COCO test-dev and 78.1 FPS on Tesla V100, an improvement of (+1.9 AP, +13.35% speed-up) over PP-YOLOv2 and (+1.3 AP, +24.96% speed-up) over YOLOX, the previous state-of-the-art industrial models. Further, PP-YOLOE inference speed reaches 149.2 FPS with TensorRT and FP16 precision. We also conduct extensive experiments to verify the effectiveness of our designs. Source code and pre-trained models are available at https://github.com/PaddlePaddle/PaddleDetection.
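The relative figures in the abstract can be turned back into approximate baseline numbers. The sketch below is only a back-of-the-envelope check, not taken from the paper: it assumes the speed-up is defined as (new FPS - old FPS) / old FPS and that the AP deltas are simple differences against 51.4 mAP. The helper name `implied_baseline_fps` is hypothetical.

```python
def implied_baseline_fps(new_fps: float, speedup_percent: float) -> float:
    """Baseline FPS implied by a relative speed-up.

    Assumption: speed-up = (new_fps - old_fps) / old_fps, so
    old_fps = new_fps / (1 + speed-up).
    """
    return new_fps / (1.0 + speedup_percent / 100.0)


# Figures quoted in the abstract for PP-YOLOE-l.
pp_yoloe_l_fps = 78.1
pp_yoloe_l_map = 51.4

# Baselines implied by the stated deltas (derived here, not quoted from the paper).
pp_yolov2_fps = implied_baseline_fps(pp_yoloe_l_fps, 13.35)
yolox_fps = implied_baseline_fps(pp_yoloe_l_fps, 24.96)
pp_yolov2_ap = round(pp_yoloe_l_map - 1.9, 1)
yolox_ap = round(pp_yoloe_l_map - 1.3, 1)

print(f"PP-YOLOv2 (implied): {pp_yolov2_ap} AP, {pp_yolov2_fps:.1f} FPS")
print(f"YOLOX     (implied): {yolox_ap} AP, {yolox_fps:.1f} FPS")
```

Under these assumptions the implied baselines come out to roughly 68.9 FPS for PP-YOLOv2 and 62.5 FPS for YOLOX; the exact values reported in the paper's tables may differ slightly because of rounding.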
Forward citations
Cited by 6 Pith papers
- Adaptive Slicing-Assisted Hyper Inference for Enhanced Small Object Detection in High-Resolution Imagery
  ASAHI adaptively slices high-res images into 6 or 12 patches, adds slicing-assisted fine-tuning, and uses Cluster-DIoU-NMS to hit 56.8% mAP on VisDrone2019 and 22.7% on xView while running 20-25% faster than fixed sli...
- Zero-Shot Learning in Industrial Scenarios: New Large-Scale Benchmark, Challenges and Baseline
  Presents the MMIO benchmark and the RTVP method, which achieves a state-of-the-art 42.2% AP in zero-shot industrial defect detection.
- Efficient RGB-T Object Detection via Sparse Cross-Modality Fusion
  A two-stage RGB-T detector performs lightweight modality-specific proposal generation followed by sparse fusion-based refinement to match the accuracy of heavier models at lower parameter and compute cost.
- AERIS: Aerial-Edge Role-Driven Intelligence at Runtime via Orchestrated Language-Model Swarm
  AERIS organizes small language models into dynamic roles for edge UAVs with attention-subgoal alignment to enable long-horizon vision-language navigation while preserving real-time closed-loop operation.
- Structure-Guided Mixed Masked Pretraining and Spatial Continuity Regularization for Printed Circuit Board Defect Detection
  A new PCB defect detection method using structure-guided masked pretraining and spatial continuity regularization achieves 85.5% mAP@0.5 on the DsPCBSD+ dataset.
- YOLOv11 Demystified: A Practical Guide to High-Performance Object Detection
  YOLOv11 delivers higher mean average precision on standard benchmarks than prior YOLO versions while keeping real-time inference speed through C3K2, SPPF, and C2PSA modules.