VisDA: The Visual Domain Adaptation Challenge

Xingchao Peng, Ben Usman, Neela Kaushik, Judy Hoffman, Dequan Wang, Kate Saenko

classification cs.CV

keywords domain, adaptation, image, visual, across, challenge, dataset, domains

We present the 2017 Visual Domain Adaptation (VisDA) dataset and challenge, a large-scale testbed for unsupervised domain adaptation across visual domains. Unsupervised domain adaptation aims to solve the real-world problem of domain shift, where machine learning models trained on one domain must be transferred and adapted to a novel visual domain without additional supervision. The VisDA2017 challenge is focused on the simulation-to-reality shift and has two associated tasks: image classification and image segmentation. The goal in both tracks is to first train a model on simulated, synthetic data in the source domain and then adapt it to perform well on real image data in the unlabeled test domain. Our dataset is the largest one to date for cross-domain object classification, with over 280K images across 12 categories in the combined training, validation and testing domains. The image segmentation dataset is also large-scale with over 30K images across 18 categories in the three domains. We compare VisDA to existing cross-domain adaptation datasets and provide a baseline performance analysis using various domain adaptation models that are currently popular in the field.

Forward citations

Cited by 5 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Rethinking the Need for Source Models: Source-Free Domain Adaptation from Scratch Guided by a Vision-Language Model
cs.CV 2026-05 unverdicted novelty 7.0

The paper introduces the VODA setting for domain adaptation from scratch using vision-language models and presents TS-DRD, which achieves competitive performance on standard benchmarks without source models.

Source-Free Domain Adaptation with Vision-Language Prior
cs.CV 2026-04 unverdicted novelty 7.0

DIFO++ adapts source-free models by customizing CLIP-like vision-language models through prompt-based mutual information maximization and distilling to the target model with gap region reduction and pseudo-label fusion.

Locality-aware Private Class Identification for Domain Adaptation with Extreme Label Shift
cs.AI 2026-05 unverdicted novelty 6.0

ReOT uses a theoretically justified locality-aware optimal transport score to identify private classes under extreme label shift and then performs reliable intra-class knowledge transfer while minimizing a generalizat...

All in One: A Unified Synthetic Data Pipeline for Multimodal Video Understanding
cs.CV 2026-04 unverdicted novelty 6.0

A unified synthetic data generation pipeline produces unlimited annotated multimodal video data across multiple tasks, enabling models trained mostly on synthetic data to generalize effectively to real-world video und...

Adaptive Dual-Teacher Distillation with Subnetwork Rectification for Bridging Semantic Gaps in Black-Box Domain Adaptation
cs.CV 2026-03 unverdicted novelty 6.0

DDSR adaptively fuses black-box and ViL predictions with subnetwork rectification and self-training to outperform prior black-box domain adaptation methods on benchmarks.