Understanding Convolution for Semantic Segmentation

Ding Liu; Garrison Cottrell; Panqu Wang; Pengfei Chen; Xiaodi Hou; Ye Yuan; Zehua Huang

arxiv: 1702.08502 · v3 · pith:KYEZSTHYnew · submitted 2017-02-27 · 💻 cs.CV


Panqu Wang, Pengfei Chen, Ye Yuan, Ding Liu, Zehua Huang, Xiaodi Hou, Garrison Cottrell

classification 💻 cs.CV

keywords convolution, segmentation, semantic, deep, dilated, framework, information, upsampling



Recent advances in deep learning, especially deep convolutional neural networks (CNNs), have led to significant improvement over previous semantic segmentation systems. Here we show how to improve pixel-wise semantic segmentation by manipulating convolution-related operations that are of both theoretical and practical value. First, we design dense upsampling convolution (DUC) to generate pixel-level prediction, which is able to capture and decode more detailed information that is generally missing in bilinear upsampling. Second, we propose a hybrid dilated convolution (HDC) framework in the encoding phase. This framework 1) effectively enlarges the receptive fields (RF) of the network to aggregate global information; 2) alleviates what we call the "gridding issue" caused by the standard dilated convolution operation. We evaluate our approaches thoroughly on the Cityscapes dataset, and achieve a state-of-the-art result of 80.1% mIOU on the test set at the time of submission. We have also achieved state-of-the-art results overall on the KITTI road estimation benchmark and the PASCAL VOC2012 segmentation task. Our source code can be found at https://github.com/TuSimple/TuSimple-DUC.
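The "gridding issue" mentioned in the abstract can be illustrated with a small sketch. The code below is not taken from the paper or its repository; it is a simplified 1-D model, with hypothetical function names and illustrative dilation rates chosen here, that tracks which input positions can influence a single output position after stacking dilated convolutions with kernel size 3.

```python
def contributing_offsets(dilation_rates, kernel_size=3):
    """Input offsets that can influence one output position after stacking
    1-D dilated convolutions with the given rates (illustrative model only)."""
    half = kernel_size // 2
    offsets = {0}
    for rate in dilation_rates:
        # A dilated kernel samples the previous layer at multiples of its rate.
        taps = [k * rate for k in range(-half, half + 1)]
        offsets = {o + t for o in offsets for t in taps}
    return sorted(offsets)


def coverage(dilation_rates, kernel_size=3):
    """Fraction of positions inside the receptive field that are actually used."""
    used = contributing_offsets(dilation_rates, kernel_size)
    span = used[-1] - used[0] + 1
    return len(used) / span


# Assumed rates for illustration: a repeated rate versus a mixed schedule.
same_rate = [2, 2, 2]
mixed_rate = [1, 2, 3]

print(contributing_offsets(same_rate), coverage(same_rate))
print(contributing_offsets(mixed_rate), coverage(mixed_rate))
```

In this toy model both schedules span the same 13 positions, but the repeated rate only ever touches even offsets, leaving holes in the receptive field, while the mixed schedule touches every position. That is the intuition behind varying dilation rates across layers; the specific rates used in the paper are not stated in the abstract.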



Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Rethinking Atrous Convolution for Semantic Image Segmentation
cs.CV 2017-06 unverdicted novelty 6.0

DeepLabv3 improves semantic segmentation by capturing multi-scale context with cascaded or parallel atrous convolutions and adding global context to ASPP, achieving better results on PASCAL VOC 2012 without DenseCRF p...

Multi-level Wavelet Convolutional Neural Networks
cs.CV 2019-07 unverdicted novelty 4.0

MWCNN integrates wavelet transforms into CNNs for image restoration tasks like denoising and super-resolution by using wavelet downsampling and inverse transforms to maintain resolution and expand context.