Scalable, High-Quality Object Detection

Christian Szegedy; Dragomir Anguelov; Dumitru Erhan; Scott Reed; Sergey Ioffe

arxiv: 1412.1441 · v3 · pith:IFTCOPHGnew · submitted 2014-12-03 · 💻 cs.CV

classification: cs.CV

keywords: proposal, detection, methods, object, data, high-quality, challenge, convolutional

Current high-quality object detection approaches use the scheme of salience-based object proposal methods followed by post-classification using deep convolutional features. This spurred recent research in improving object proposal methods. However, domain-agnostic proposal generation has the principal drawback that the proposals come unranked or with very weak ranking, making it hard to trade off quality for running time. This raises the more fundamental question of whether high-quality proposal generation requires careful engineering or can be derived from data alone. We demonstrate that learning-based proposal methods can effectively match the performance of hand-engineered methods while allowing for very efficient runtime-quality trade-offs. Using the multi-scale convolutional MultiBox (MSC-MultiBox) approach, we substantially advance the state-of-the-art on the ILSVRC 2014 detection challenge data set, with $0.5$ mAP for a single model and $0.52$ mAP for an ensemble of two models. MSC-MultiBox significantly improves the proposal quality over its predecessor, the MultiBox method: AP increases from $0.42$ to $0.53$ for the ILSVRC detection challenge. Finally, we demonstrate improved bounding-box recall compared to Multiscale Combinatorial Grouping with fewer proposals on the Microsoft-COCO data set.
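
The abstract's point about ranked proposals and runtime-quality trade-offs can be illustrated with a minimal sketch: when each proposal carries a confidence score, one can keep only the top k and measure how much bounding-box recall survives. This is not the paper's evaluation code. The names `iou` and `recall_at_k`, the `(score, box)` tuple layout, and the 0.5 IoU threshold are all assumptions made for illustration.

```python
from typing import List, Tuple

# A box is (x_min, y_min, x_max, y_max); this layout is an assumption.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def recall_at_k(
    proposals: List[Tuple[float, Box]],
    truths: List[Box],
    k: int,
    thresh: float = 0.5,  # illustrative threshold, not taken from the paper
) -> float:
    """Fraction of ground-truth boxes matched by any of the k top-scoring proposals."""
    if not truths:
        return 0.0
    # Ranking by score is what lets a smaller k buy a shorter running time.
    top = sorted(proposals, key=lambda p: p[0], reverse=True)[:k]
    hits = sum(1 for t in truths if any(iou(box, t) >= thresh for _, box in top))
    return hits / len(truths)


if __name__ == "__main__":
    truths = [(0, 0, 10, 10), (20, 20, 30, 30)]
    proposals = [
        (0.9, (0, 0, 10, 10)),
        (0.8, (50, 50, 60, 60)),
        (0.3, (20, 20, 30, 31)),
    ]
    for k in (1, 2, 3):
        print(k, recall_at_k(proposals, truths, k))
```

With unranked or weakly ranked proposals, truncating the list in this way discards boxes more or less at random, which is the drawback the abstract attributes to domain-agnostic proposal methods.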

Forward citations

Cited by 3 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Adaptive Multi-Scale Goodness Aggregation for Forward-Forward Learning
cs.LG 2026-05 unverdicted novelty 6.0

AMSGA extends Forward-Forward learning via multi-scale goodness aggregation, curriculum-guided hard negative mining, and adaptive thresholds, reporting up to 1.5% accuracy gains on MNIST and Fashion-MNIST.

Towards Adversarially Robust Object Detection
cs.CV 2019-07 unverdicted novelty 5.0

Develops a multi-task-learning-based adversarial training approach to improve the robustness of object detectors to adversarial attacks, with experiments on PASCAL-VOC and MS-COCO.

Distill-2MD-MTL: Data Distillation based on Multi-Dataset Multi-Domain Multi-Task Frame Work to Solve Face Related Tasksks, Multi Task Learning, Semi-Supervised Learning
cs.CV 2019-07 unverdicted novelty 4.0

Proposes Distill-2MD-MTL, an MTL-based data distillation framework for semi-supervised multi-domain face analysis tasks that claims better performance than single-task baselines.