TernausNet: U-Net with VGG11 Encoder Pre-Trained on ImageNet for Image Segmentation

Alexey Shvets; Vladimir Iglovikov

arxiv: 1801.05746 · v1 · pith:DGNPPRKZnew · submitted 2018-01-17 · cs.CV

keywords network · pre-trained · encoder · image · segmentation · u-net · weights · architecture

Pixel-wise image segmentation is a demanding task in computer vision. Classical U-Net architectures composed of encoders and decoders are very popular for segmentation of medical images, satellite images, etc. Typically, a neural network initialized with weights from a network pre-trained on a large dataset like ImageNet shows better performance than one trained from scratch on a small dataset. In some practical applications, particularly in medicine and traffic safety, the accuracy of the models is of utmost importance. In this paper, we demonstrate how a U-Net type architecture can be improved by the use of a pre-trained encoder. Our code and corresponding pre-trained weights are publicly available at https://github.com/ternaus/TernausNet. We compare three weight initialization schemes: LeCun uniform, the encoder with weights from VGG11, and the full network trained on the Carvana dataset. This network architecture was a part of the winning solution (1st out of 735) in the Kaggle Carvana Image Masking Challenge.
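
Of the three initialization schemes named in the abstract, LeCun uniform is the from-scratch baseline. A minimal sketch in plain Python, assuming the standard definition (weights drawn from U(-limit, limit) with limit = sqrt(3 / fan_in), which gives variance 1 / fan_in). The function name `lecun_uniform` and the layer shape below are illustrative assumptions, not taken from the TernausNet code:

```python
import math
import random


def lecun_uniform(fan_in, fan_out, seed=None):
    """Return a fan_out x fan_in weight matrix drawn from U(-limit, limit).

    limit = sqrt(3 / fan_in), so each weight has variance 1 / fan_in.
    """
    rng = random.Random(seed)
    limit = math.sqrt(3.0 / fan_in)
    return [
        [rng.uniform(-limit, limit) for _ in range(fan_in)]
        for _ in range(fan_out)
    ]


# Hypothetical layer: a 3x3 convolution over 3 input channels with 64 filters,
# so fan_in = 3 * 3 * 3 = 27 and the sampling bound is sqrt(3 / 27) = 1/3.
fan_in = 3 * 3 * 3
weights = lecun_uniform(fan_in=fan_in, fan_out=64, seed=0)
limit = math.sqrt(3.0 / fan_in)
```

The other two schemes replace this random draw with learned values: encoder weights copied from a VGG11 trained on ImageNet, or weights for the whole network taken from training on the Carvana dataset.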

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Deep-Learning for Tidemark Segmentation in Human Osteochondral Tissues Imaged with Micro-computed Tomography
eess.IV 2019-07 accept novelty 5.0

U-Net with combined binary cross-entropy and soft Jaccard loss segments the tidemark in PTA-stained micro-CT images, achieving IoU scores of 0.59–0.86 at increasing padded distances on cross-validation of 35 samples.

Generating large labeled data sets for laparoscopic image processing tasks using unpaired image-to-image translation
cs.LG 2019-07 unverdicted novelty 5.0

Generates large labeled realistic laparoscopic image datasets from simulations using extended unpaired translation and demonstrates use for liver segmentation achieving Dice scores up to 0.89 without any real labeled data.
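
The two summaries above report IoU (Jaccard) and Dice scores. As a rough illustration of how these overlap metrics are usually computed for binary masks, here is a small sketch; the helper names `iou` and `dice` and the toy masks are assumptions for illustration, not code from either cited paper:

```python
def iou(pred, target):
    """Intersection over union (Jaccard index) of two flat binary masks."""
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    union = sum(1 for p, t in zip(pred, target) if p or t)
    # Convention assumed here: two empty masks count as a perfect match.
    return intersection / union if union else 1.0


def dice(pred, target):
    """Dice coefficient: 2 * |A and B| / (|A| + |B|) for flat binary masks."""
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    return 2 * intersection / total if total else 1.0


# Toy 4-pixel masks: one pixel overlaps, three pixels are covered in total.
pred = [1, 1, 0, 0]
target = [1, 0, 1, 0]
j = iou(pred, target)   # 1 / 3
d = dice(pred, target)  # 2 * 1 / (2 + 2) = 0.5
```

The two metrics are related by Dice = 2J / (1 + J), so for the same prediction the Dice score is never lower than the IoU.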