DeepLabv3 improves semantic segmentation by capturing multi-scale context with cascaded or parallel atrous convolutions and adding global context to ASPP, achieving better results on PASCAL VOC 2012 without DenseCRF post-processing.
Understanding Convolution for Semantic Segmentation
2 Pith papers cite this work. Polarity classification is still indexing.
abstract
Recent advances in deep learning, especially deep convolutional neural networks (CNNs), have led to significant improvement over previous semantic segmentation systems. Here we show how to improve pixel-wise semantic segmentation by manipulating convolution-related operations that are of both theoretical and practical value. First, we design dense upsampling convolution (DUC) to generate pixel-level prediction, which is able to capture and decode more detailed information that is generally missing in bilinear upsampling. Second, we propose a hybrid dilated convolution (HDC) framework in the encoding phase. This framework 1) effectively enlarges the receptive fields (RF) of the network to aggregate global information; 2) alleviates what we call the "gridding issue" caused by the standard dilated convolution operation. We evaluate our approaches thoroughly on the Cityscapes dataset, and achieve a state-of-art result of 80.1% mIOU in the test set at the time of submission. We also have achieved state-of-the-art overall on the KITTI road estimation benchmark and the PASCAL VOC2012 segmentation task. Our source code can be found at https://github.com/TuSimple/TuSimple-DUC .
citation-role summary
citation-polarity summary
fields
cs.CV 2verdicts
UNVERDICTED 2roles
background 1polarities
background 1representative citing papers
MWCNN integrates wavelet transforms into CNNs for image restoration tasks like denoising and super-resolution by using wavelet downsampling and inverse transforms to maintain resolution and expand context.
citing papers explorer
-
Rethinking Atrous Convolution for Semantic Image Segmentation
DeepLabv3 improves semantic segmentation by capturing multi-scale context with cascaded or parallel atrous convolutions and adding global context to ASPP, achieving better results on PASCAL VOC 2012 without DenseCRF post-processing.
-
Multi-level Wavelet Convolutional Neural Networks
MWCNN integrates wavelet transforms into CNNs for image restoration tasks like denoising and super-resolution by using wavelet downsampling and inverse transforms to maintain resolution and expand context.