DenseBox: Unifying Landmark Localization with End to End Object Detection

2015 · cs.CV · arXiv 1509.04874

3 Pith papers cite this work. Polarity classification is still indexing.

abstract

How well can a single fully convolutional neural network (FCN) perform on object detection? We introduce DenseBox, a unified end-to-end FCN framework that directly predicts bounding boxes and object class confidences across all locations and scales of an image. Our contribution is two-fold. First, we show that a single FCN, if designed and optimized carefully, can detect multiple different objects extremely accurately and efficiently. Second, we show that when combined with landmark localization during multi-task learning, DenseBox further improves object detection accuracy. We present experimental results on public benchmark datasets, including MALF face detection and KITTI car detection, which indicate that DenseBox is the state-of-the-art system for detecting challenging objects such as faces and cars.

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

Object Detection in Video with Spatial-temporal Context Aggregation

cs.CV · 2019-07-11 · unverdicted · novelty 6.0

Proposal-level spatio-temporal context aggregation for video object detection achieves 80.3% mAP on ImageNet VID, improving Faster R-CNN baseline by 5.8%.

Adapted Center and Scale Prediction: More Stable and More Accurate

cs.CV · 2020-02-20 · unverdicted · novelty 4.0

Adaptations to CSP including compressing width prediction achieve 9.3% MR on CityPersons reasonable set, showing anchor-free one-stage detectors can reach high accuracy.

Document Parsing Unveiled: Techniques, Challenges, and Prospects for Structured Information Extraction

cs.MM · 2024-10-28 · unverdicted · novelty 3.0

Survey proposing a taxonomy for document parsing into pipeline-based systems and VLM-driven unified models, reviewing components, metrics, benchmarks, and challenges.

citing papers explorer

Showing 3 of 3 citing papers.

Object Detection in Video with Spatial-temporal Context Aggregation

cs.CV · 2019-07-11 · unverdicted · none · ref 5 · internal anchor

Proposal-level spatio-temporal context aggregation for video object detection achieves 80.3% mAP on ImageNet VID, improving Faster R-CNN baseline by 5.8%.

Adapted Center and Scale Prediction: More Stable and More Accurate

cs.CV · 2020-02-20 · unverdicted · none · ref 12 · internal anchor

Adaptations to CSP including compressing width prediction achieve 9.3% MR on CityPersons reasonable set, showing anchor-free one-stage detectors can reach high accuracy.

Document Parsing Unveiled: Techniques, Challenges, and Prospects for Structured Information Extraction

cs.MM · 2024-10-28 · unverdicted · none · ref 85 · internal anchor

Survey proposing a taxonomy for document parsing into pipeline-based systems and VLM-driven unified models, reviewing components, metrics, benchmarks, and challenges.
