Scalable, High-Quality Object Detection
read the original abstract
Current high-quality object detection approaches use the scheme of salience-based object proposal methods followed by post-classification using deep convolutional features. This spurred recent research in improving object proposal methods. However, domain agnostic proposal generation has the principal drawback that the proposals come unranked or with very weak ranking, making it hard to trade-off quality for running time. This raises the more fundamental question of whether high-quality proposal generation requires careful engineering or can be derived just from data alone. We demonstrate that learning-based proposal methods can effectively match the performance of hand-engineered methods while allowing for very efficient runtime-quality trade-offs. Using the multi-scale convolutional MultiBox (MSC-MultiBox) approach, we substantially advance the state-of-the-art on the ILSVRC 2014 detection challenge data set, with $0.5$ mAP for a single model and $0.52$ mAP for an ensemble of two models. MSC-Multibox significantly improves the proposal quality over its predecessor MultiBox~method: AP increases from $0.42$ to $0.53$ for the ILSVRC detection challenge. Finally, we demonstrate improved bounding-box recall compared to Multiscale Combinatorial Grouping with less proposals on the Microsoft-COCO data set.
This paper has not been read by Pith yet.
Forward citations
Cited by 3 Pith papers
-
Adaptive Multi-Scale Goodness Aggregation for Forward-Forward Learning
AMSGA extends Forward-Forward learning via multi-scale goodness aggregation, curriculum-guided hard negative mining, and adaptive thresholds, reporting up to 1.5% accuracy gains on MNIST and Fashion-MNIST.
-
Towards Adversarially Robust Object Detection
Develops a multi-task learning based adversarial training approach to improve robustness of object detectors to adversarial attacks, with experiments on PASCAL-VOC and MS-COCO.
-
Distill-2MD-MTL: Data Distillation based on Multi-Dataset Multi-Domain Multi-Task Frame Work to Solve Face Related Tasksks, Multi Task Learning, Semi-Supervised Learning
Proposes Distill-2MD-MTL, an MTL-based data distillation framework for semi-supervised multi-domain face analysis tasks that claims better performance than single-task baselines.
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.