Frustum PointNets for 3D Object Detection from RGB-D Data

Charles R. Qi; Chenxia Wu; Hao Su; Leonidas J. Guibas; Wei Liu

arxiv: 1711.08488 · v2 · pith:MFMUARAFnew · submitted 2017-11-22 · 💻 cs.CV

classification 💻 cs.CV

keywords object, rgb-d, clouds, data, detection, method, point, directly

In this work, we study 3D object detection from RGB-D data in both indoor and outdoor scenes. While previous methods focus on images or 3D voxels, often obscuring natural 3D patterns and invariances of 3D data, we directly operate on raw point clouds by popping up RGB-D scans. However, a key challenge of this approach is how to efficiently localize objects in point clouds of large-scale scenes (region proposal). Instead of solely relying on 3D proposals, our method leverages both mature 2D object detectors and advanced 3D deep learning for object localization, achieving efficiency as well as high recall for even small objects. Benefiting from learning directly in raw point clouds, our method is also able to precisely estimate 3D bounding boxes even under strong occlusion or with very sparse points. Evaluated on KITTI and SUN RGB-D 3D detection benchmarks, our method outperforms the state of the art by remarkable margins while having real-time capability.
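
As a rough illustration of the idea described in the abstract (lifting a depth map into a point cloud, then using a 2D detection box to narrow the 3D search to the points inside that box's viewing frustum), the following is a minimal sketch. It assumes a simple pinhole camera; the intrinsics `fx`, `fy`, `cx`, `cy`, the 2D box format, and the function names `depth_to_points` and `frustum_points` are hypothetical and are not taken from the paper or its code. The learned 3D stages are not shown.

```python
import numpy as np

def depth_to_points(depth, fx, fy, cx, cy):
    """Back-project a depth map of shape (H, W) into an (H*W, 3) array
    of camera-frame points, assuming a pinhole camera model."""
    h, w = depth.shape
    # u holds pixel column indices, v holds pixel row indices.
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=-1).reshape(-1, 3)

def frustum_points(points, box2d, fx, fy, cx, cy):
    """Keep the points whose image projection falls inside the 2D box
    (u_min, v_min, u_max, v_max); this region is the box's frustum."""
    # Drop points with no valid depth before projecting.
    pts = points[points[:, 2] > 0]
    u = fx * pts[:, 0] / pts[:, 2] + cx
    v = fy * pts[:, 1] / pts[:, 2] + cy
    u_min, v_min, u_max, v_max = box2d
    inside = (u >= u_min) & (u <= u_max) & (v >= v_min) & (v <= v_max)
    return pts[inside]
```

In a pipeline of this kind, `box2d` would come from an off-the-shelf 2D detector, and the returned subset of points would be passed on to a point-cloud network for segmentation and box estimation.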

Forward citations

Cited by 4 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Deep Radar Detector
cs.CV 2019-06 unverdicted novelty 6.0

A deep neural network trained on radar calibration data with custom augmentations outperforms classical methods on 4D detection while running in real time.

Voxel-FPN: multi-scale voxel feature aggregation in 3D object detection from point clouds
cs.CV 2019-06 unverdicted novelty 4.0

Voxel-FPN proposes an encoder-decoder architecture for multi-scale voxel feature aggregation in one-stage 3D object detection from point clouds, reporting competitive speed and accuracy on KITTI-3D.

End-to-End 3D-PointCloud Semantic Segmentation for Autonomous Driving
cs.CV 2019-06 unverdicted novelty 4.0

Proposes weighted self-incremental transfer learning to address class imbalance in 3D point cloud semantic segmentation and reports a new benchmark on the KITTI dataset.

A review on deep learning techniques for 3D sensed data classification
cs.CV 2019-07 unverdicted novelty 1.0

A survey of deep learning architectures for 3D sensed data classification covering RGB-D, multi-view, volumetric and end-to-end methods along with datasets and future directions.