Speed/accuracy trade-offs for modern convolutional object detectors
3 Pith papers cite this work. Polarity classification is still indexing.
abstract
The goal of this paper is to serve as a guide for selecting a detection architecture that achieves the right speed/memory/accuracy balance for a given application and platform. To this end, we investigate various ways to trade accuracy for speed and memory usage in modern convolutional object detection systems. A number of successful systems have been proposed in recent years, but apples-to-apples comparisons are difficult due to different base feature extractors (e.g., VGG, Residual Networks), different default image resolutions, as well as different hardware and software platforms. We present a unified implementation of the Faster R-CNN [Ren et al., 2015], R-FCN [Dai et al., 2016] and SSD [Liu et al., 2015] systems, which we view as "meta-architectures" and trace out the speed/accuracy trade-off curve created by using alternative feature extractors and varying other critical parameters such as image size within each of these meta-architectures. On one extreme end of this spectrum where speed and memory are critical, we present a detector that achieves real time speeds and can be deployed on a mobile device. On the opposite end in which accuracy is critical, we present a detector that achieves state-of-the-art performance measured on the COCO detection task.
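The trade-off curve described in the abstract can be read as a Pareto frontier: the set of configurations for which no other configuration is both faster and more accurate. The following is a minimal sketch of that idea only, not the paper's implementation. The `Config` type, the `pareto_frontier` helper, the labels A to E, and all numbers are hypothetical placeholders, not measurements from the paper.

```python
from typing import NamedTuple


class Config(NamedTuple):
    """One detector configuration (hypothetical; values are placeholders)."""
    name: str          # illustrative label, not a real model
    latency_ms: float  # inference time per image; lower is better
    accuracy: float    # detection accuracy score; higher is better


def pareto_frontier(configs: list[Config]) -> list[Config]:
    """Return the configs that no other config beats on both axes.

    The result is sorted by increasing latency, so accuracy increases
    strictly along the returned list.
    """
    frontier: list[Config] = []
    best_accuracy = float("-inf")
    # Visit the fastest configs first; on equal latency, the more accurate first.
    for cfg in sorted(configs, key=lambda c: (c.latency_ms, -c.accuracy)):
        # A slower config earns its place only by being strictly more accurate
        # than everything faster than it.
        if cfg.accuracy > best_accuracy:
            frontier.append(cfg)
            best_accuracy = cfg.accuracy
    return frontier


# Placeholder numbers for illustration only; these are NOT results from the paper.
candidates = [
    Config("A", 30.0, 0.20),
    Config("B", 80.0, 0.28),
    Config("C", 90.0, 0.25),   # slower and less accurate than B
    Config("D", 200.0, 0.35),
    Config("E", 30.0, 0.18),   # same speed as A but less accurate
]

frontier = pareto_frontier(candidates)
print([c.name for c in frontier])
```

With these placeholder values, C and E are dropped because another configuration is at least as fast and more accurate. The remaining points are the ones a practitioner would choose among for a given latency budget.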
citation-role summary
citation-polarity summary
roles
baseline 1
polarities
baseline 1
representative citing papers
MobileNets introduce depthwise separable convolutions plus width and resolution multipliers to produce efficient CNNs that trade off latency and accuracy for mobile and embedded vision applications.
Object detection on satellite images enables estimation of average annual daily truck traffic as a proof-of-concept for regions lacking ground monitoring.
citing papers explorer
- Cascade R-CNN: High Quality Object Detection and Instance Segmentation
  Cascade R-CNN uses a cascade of detectors trained with progressively higher IoU thresholds to resolve overfitting and quality mismatch, achieving state-of-the-art high-quality object detection and instance segmentation on COCO and other datasets.
- MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications
  MobileNets introduce depthwise separable convolutions plus width and resolution multipliers to produce efficient CNNs that trade off latency and accuracy for mobile and embedded vision applications.
- Truck Traffic Monitoring with Satellite Images
  Object detection on satellite images enables estimation of average annual daily truck traffic as a proof-of-concept for regions lacking ground monitoring.