arxiv: 1612.02649 · v1 · pith:7FIC3POLnew · submitted 2016-12-08 · 💻 cs.CV

FCNs in the Wild: Pixel-level Adversarial and Constraint-based Adaptation

Judy Hoffman, Dequan Wang, Fisher Yu, Trevor Darrell

classification 💻 cs.CV

keywords domain, adaptation, adversarial, different, across, approach, category, city

Fully convolutional models for dense prediction have proven successful for a wide range of visual tasks. Such models perform well in a supervised setting, but performance can be surprisingly poor under domain shifts that appear mild to a human observer. For example, training on one city and testing on another in a different geographic region and/or weather condition may result in significantly degraded performance due to pixel-level distribution shift. In this paper, we introduce the first domain adaptive semantic segmentation method, proposing an unsupervised adversarial approach to pixel prediction problems. Our method consists of both global and category specific adaptation techniques. Global domain alignment is performed using a novel semantic segmentation network with fully convolutional domain adversarial learning. This initially adapted space then enables category specific adaptation through a generalization of constrained weak learning, with explicit transfer of the spatial layout from the source to the target domains. Our approach outperforms baselines across different settings on multiple large-scale datasets, including adapting across various real city environments, different synthetic sub-domains, from simulated to real environments, and on a novel large-scale dash-cam dataset.
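
The abstract describes global alignment through fully convolutional domain adversarial learning but gives no formula. As a rough illustration only, the sketch below assumes a hypothetical per-location domain classifier that outputs, for every spatial position of a feature map, the probability that the position comes from the source domain. The function names, the binary cross-entropy form, and the label-flipping confusion objective are assumptions chosen for illustration, not the loss defined in the paper.

```python
import math


def domain_classifier_loss(p_source_map, is_source, eps=1e-12):
    """Mean binary cross-entropy over all spatial locations of one image.

    p_source_map: H x W nested list; each entry is the (hypothetical) domain
        classifier's probability that this location is from the source domain.
    is_source: True if the image really comes from the source domain.
    """
    total, count = 0.0, 0
    for row in p_source_map:
        for p in row:
            # Probability the classifier assigns to the correct domain label.
            correct_prob = p if is_source else 1.0 - p
            total -= math.log(max(correct_prob, eps))
            count += 1
    return total / count


def confusion_loss(p_source_map, is_source, eps=1e-12):
    """Assumed adversarial objective for the segmentation features.

    It reuses the classifier loss with the domain label flipped, so it is
    small when the classifier mistakes the image's domain at each location.
    """
    return domain_classifier_loss(p_source_map, not is_source, eps)


# A target-domain image that the classifier confidently recognizes as target.
target_map = [[0.1, 0.1], [0.1, 0.1]]
print(domain_classifier_loss(target_map, is_source=False))
print(confusion_loss(target_map, is_source=False))
```

In this toy case the classifier loss is low and the confusion loss is high, which is the signal that would push the features toward being indistinguishable across domains. Under this assumed setup, training would alternate between minimizing the first quantity with respect to the classifier and the second with respect to the segmentation network.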

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Adapting Foundation Models for Annotation-Efficient Adnexal Mass Segmentation in Cine Images
cs.CV · 2026-04 · conditional · novelty 5.0

DINOv3-based model reaches Dice 0.945 on adnexal mass segmentation in 7777 ultrasound frames and holds performance when trained on only 25% of the data.