Object Detection with Deep Learning: A Review

2018 · cs.CV · arXiv 1807.05511

2 Pith papers cite this work. Polarity classification is still indexing.


abstract

Due to object detection's close relationship with video analysis and image understanding, it has attracted much research attention in recent years. Traditional object detection methods are built on handcrafted features and shallow trainable architectures. Their performance easily stagnates by constructing complex ensembles that combine multiple low-level image features with high-level context from object detectors and scene classifiers. With the rapid development of deep learning, more powerful tools, which are able to learn semantic, high-level, deeper features, have been introduced to address the problems of traditional architectures. These models differ in network architecture, training strategy, optimization function, and so on. In this paper, we provide a review of deep learning based object detection frameworks. Our review begins with a brief introduction to the history of deep learning and its representative tool, namely the Convolutional Neural Network (CNN). Then we focus on typical generic object detection architectures along with some modifications and useful tricks to improve detection performance further. As distinct specific detection tasks exhibit different characteristics, we also briefly survey several specific tasks, including salient object detection, face detection and pedestrian detection. Experimental analyses are also provided to compare various methods and draw some meaningful conclusions. Finally, several promising directions and tasks are provided to serve as guidelines for future work in both object detection and relevant neural network based learning systems.

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

RobustTP: End-to-End Trajectory Prediction for Heterogeneous Road-Agents in Dense Traffic with Noisy Sensor Inputs

cs.RO · 2019-07-20 · unverdicted · novelty 4.0

RobustTP uses a non-linear motion model plus instance segmentation to create noisy trajectories, then an LSTM-CNN to predict 5-second future positions of heterogeneous agents in dense traffic, claiming up to 18% ADE and 35.5% FDE gains over prior methods.

Integration of Object Detection and Small VLMs for Construction Safety Hazard Identification

cs.CV · 2026-04-06 · unverdicted · novelty 4.0

Detection-guided prompting raises small VLM hazard F1 from 34.5% to 50.6% and BERTScore from 0.61 to 0.82 on construction images with only 2.5 ms added latency.

citing papers explorer


RobustTP: End-to-End Trajectory Prediction for Heterogeneous Road-Agents in Dense Traffic with Noisy Sensor Inputs

cs.RO · 2019-07-20 · unverdicted · none · ref 36 · internal anchor

Integration of Object Detection and Small VLMs for Construction Safety Hazard Identification

cs.CV · 2026-04-06 · unverdicted · none · ref 7
