ParseNet: Looking Wider to See Better
We present a technique for adding global context to deep convolutional networks for semantic segmentation. The approach is simple, using the average feature for a layer to augment the features at each location. In addition, we study several idiosyncrasies of training, significantly increasing the performance of baseline networks (e.g. from FCN). When we add our proposed global feature, and a technique for learning normalization parameters, accuracy increases consistently even over our improved versions of the baselines. Our proposed approach, ParseNet, achieves state-of-the-art performance on SiftFlow and PASCAL-Context with small additional computational cost over baselines, and near current state-of-the-art performance on PASCAL VOC 2012 semantic segmentation with a simple approach. Code is available at https://github.com/weiliu89/caffe/tree/fcn.
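The global-context idea in the abstract (average a layer's features over all locations, then augment every location with that average) can be illustrated with a minimal NumPy sketch. This is not the authors' Caffe code. The function names `l2_normalize` and `add_global_context`, the `(C, H, W)` layout, the use of concatenation as the way to "augment", and the choice of a per-channel scaled L2 normalization as the "learned normalization parameters" are all assumptions made for illustration; in a real network the scale vectors would be trained rather than fixed.

```python
import numpy as np


def l2_normalize(features, scale, eps=1e-12):
    """L2-normalize each location along the channel axis, then rescale.

    features: array of shape (C, H, W)
    scale:    array of shape (C,), a stand-in for learned per-channel scales
    """
    norm = np.sqrt(np.sum(features ** 2, axis=0, keepdims=True)) + eps
    return features / norm * scale[:, None, None]


def add_global_context(features, local_scale, global_scale):
    """Augment every location with the layer's average feature.

    Hypothetical helper: returns an array of shape (2C, H, W) whose first C
    channels are the normalized local features and whose last C channels are
    the normalized global average, repeated at every location.
    """
    channels, height, width = features.shape

    # Global average pooling: one C-dimensional vector for the whole map.
    global_feature = features.mean(axis=(1, 2), keepdims=True)  # (C, 1, 1)

    # "Unpool" by repeating the global vector at each spatial location.
    global_map = np.broadcast_to(global_feature, (channels, height, width))

    # Normalize both parts so their magnitudes are comparable before fusing.
    local_normed = l2_normalize(features, local_scale)
    global_normed = l2_normalize(global_map, global_scale)

    return np.concatenate([local_normed, global_normed], axis=0)


# Usage: a toy 2-channel, 3x4 feature map with unit scales.
toy_features = np.arange(24, dtype=float).reshape(2, 3, 4) + 1.0
fused = add_global_context(toy_features, np.ones(2), np.ones(2))
print(fused.shape)
```

Normalizing before fusing is the design point worth noting: features from different sources can differ widely in magnitude, and without rescaling the larger one would dominate the combined representation.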
Forward citations
Cited by 5 Pith papers
- Rethinking Atrous Convolution for Semantic Image Segmentation
  DeepLabv3 improves semantic segmentation by capturing multi-scale context with cascaded or parallel atrous convolutions and adding global context to ASPP, achieving better results on PASCAL VOC 2012 without DenseCRF p...
- Attention-Mamba: A Mamba-Enhanced Multi-Scale Parallel Inference Network for Medical Image Segmentation
  Attention-Mamba uses parallel branches, Recursive Alignment Module, and Mamba-enhanced attention to report highest segmentation accuracy on Synapse, ACDC, ISIC-2018, and PH2 with 14.05M parameters and 8.94 GFLOPs.
- Improving Semantic Segmentation via Dilated Affinity
  Dilated affinity is jointly predicted with segmentation labels to strengthen features and support efficient label propagation refinement on benchmark datasets.
- Adaptive Context Encoding Module for Semantic Segmentation
  Proposes ACE module with three deformable convolution blocks that outperforms PPM and ASPP on Pascal-Context and ADE20K datasets for semantic segmentation.
- Learning Where to Look While Tracking Instruments in Robot-assisted Surgery
  An end-to-end multitask model with shared encoder, separate decoders, batch-Wasserstein loss, and soft attention module reports better performance than prior segmentation and saliency methods on the MICCAI robotic ins...