Fully Convolutional Adaptation Networks for Semantic Segmentation

Dong Liu; Tao Mei; Ting Yao; Yiheng Zhang; Zhaofan Qiu

arxiv: 1804.08286 · v1 · pith:ILTFQ4HNnew · submitted 2018-04-23 · 💻 cs.CV


Yiheng Zhang, Zhaofan Qiu, Ting Yao, Dong Liu, Tao Mei

classification 💻 cs.CV

keywords adaptation, domain, networks, images, segmentation, semantic, convolutional, datasets


Recent advances in deep neural networks have convincingly demonstrated a high capability for learning vision models on large datasets. Nevertheless, collecting expert-labeled datasets, especially with pixel-level annotations, is an extremely expensive process. An appealing alternative is to render synthetic data (e.g., from computer games) and generate ground truth automatically. However, simply applying models learnt on synthetic images may lead to high generalization error on real images due to domain shift. In this paper, we address this issue from the perspectives of both visual appearance-level and representation-level domain adaptation. The former adapts source-domain images to appear as if drawn from the "style" of the target domain, and the latter attempts to learn domain-invariant representations. Specifically, we present Fully Convolutional Adaptation Networks (FCAN), a novel deep architecture for semantic segmentation which combines Appearance Adaptation Networks (AAN) and Representation Adaptation Networks (RAN). AAN learns a transformation from one domain to the other in the pixel space, and RAN is optimized in an adversarial learning manner to maximally fool the domain discriminator with the learnt source and target representations. Extensive experiments are conducted on the transfer from GTA5 (game videos) to Cityscapes (urban street scenes) on semantic segmentation, and our proposal achieves superior results when compared to state-of-the-art unsupervised adaptation techniques. More remarkably, we obtain a new record: mIoU of 47.5% on BDDS (drive-cam videos) in an unsupervised setting.
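The adversarial idea behind RAN can be illustrated with a small sketch. This is not the authors' implementation: it is a generic binary cross-entropy formulation of adversarial domain adaptation, written under the assumption that a domain discriminator outputs one logit per sample and that source samples are labeled 1 and target samples 0 (the labeling convention is arbitrary). The function names `discriminator_loss` and `adversarial_loss` are hypothetical and chosen for illustration only.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def discriminator_loss(source_logits, target_logits):
    """Binary cross-entropy for a domain discriminator.

    Assumption: source samples carry label 1, target samples label 0.
    The loss is low when the discriminator separates the domains well.
    """
    eps = 1e-12  # guards against log(0)
    src_term = sum(math.log(sigmoid(z) + eps) for z in source_logits)
    tgt_term = sum(math.log(1.0 - sigmoid(z) + eps) for z in target_logits)
    return -(src_term / len(source_logits) + tgt_term / len(target_logits))


def adversarial_loss(source_logits, target_logits):
    """Objective for the feature extractor in this sketch.

    It is the negated discriminator loss, so minimizing it pushes the
    representations toward values the discriminator cannot tell apart.
    """
    return -discriminator_loss(source_logits, target_logits)


if __name__ == "__main__":
    # A discriminator that cannot separate the domains (all logits 0).
    print(discriminator_loss([0.0, 0.0], [0.0, 0.0]))
    # A discriminator that separates the domains confidently.
    print(discriminator_loss([10.0], [-10.0]))
```

In this sketch, the discriminator and the feature extractor would be updated in alternation: the discriminator minimizes `discriminator_loss`, while the feature extractor minimizes `adversarial_loss` alongside its segmentation loss. Domain-invariant features correspond to the case where the discriminator's outputs carry no information about the domain.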


Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

How much real data do we actually need: Analyzing object detection performance using synthetic and real data
cs.CV · 2019-07 · unverdicted · novelty 3.0

Synthetic data can partially substitute for real data in object detection training, with performance tied to domain similarity and the volume of real data included.