arxiv: 1907.03644 · v2 · pith:SKELFVSAnew · submitted 2019-07-08 · 💻 cs.CV

Unsupervised Domain Alignment to Mitigate Low Level Dataset Biases

Pith reviewed 2026-05-25 01:07 UTC · model grok-4.3

classification 💻 cs.CV
keywords: dataset bias · domain adaptation · generative adversarial networks · cycle consistency loss · SSIM · unsupervised domain alignment · image augmentation

The pith

A generative network learns a mapping from biased training images to the target test domain while preserving labels to reduce dataset bias.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper aims to mitigate dataset bias in computer vision models by augmenting the training data with a learned transformation that aligns it with the test domain. The transformation is a generative adversarial network trained with cycle consistency and adversarial losses, plus an SSIM loss intended to keep the original labels intact. This matters because models often fail to generalize across image collections due to low-level biases, and an unsupervised method could improve deployment without requiring target labels. If the claim holds, training on the transformed data should improve performance on the target domain by addressing the bias at the source.

Core claim

The central claim is that a non-linear mapping from the source domain to the target domain can be learned with the cycle consistency and adversarial losses of generative adversarial networks, and that an additional structural similarity index (SSIM) loss enforces label retention, so that the transformed training set mitigates low-level dataset biases.

What carries the argument

Generative network using cycle consistency, adversarial, and SSIM losses for unsupervised domain mapping with label preservation.
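
As a sketch of how these three terms are usually combined, the objective below follows the standard CycleGAN form with an added SSIM term. The inverse generator F, the discriminators D_S and D_T, and the weights \lambda_cyc and \lambda_SSIM are assumptions introduced for illustration; this page does not give the paper's exact formulation or weighting.

```latex
\begin{align}
\mathcal{L}_{\mathrm{GAN}}(G, D_T) &= \mathbb{E}_{x_T}\big[\log D_T(x_T)\big]
  + \mathbb{E}_{x_S}\big[\log\big(1 - D_T(G(x_S))\big)\big] \\
\mathcal{L}_{\mathrm{cyc}}(G, F) &= \mathbb{E}_{x_S}\big[\lVert F(G(x_S)) - x_S \rVert_1\big]
  + \mathbb{E}_{x_T}\big[\lVert G(F(x_T)) - x_T \rVert_1\big] \\
\mathcal{L}_{\mathrm{SSIM}}(G) &= \mathbb{E}_{x_S}\big[1 - \mathrm{SSIM}\big(x_S, G(x_S)\big)\big] \\
\mathcal{L}(G, F, D_S, D_T) &= \mathcal{L}_{\mathrm{GAN}}(G, D_T) + \mathcal{L}_{\mathrm{GAN}}(F, D_S)
  + \lambda_{\mathrm{cyc}}\,\mathcal{L}_{\mathrm{cyc}}(G, F)
  + \lambda_{\mathrm{SSIM}}\,\mathcal{L}_{\mathrm{SSIM}}(G)
\end{align}
```

Here \mathcal{L}_{GAN}(F, D_S) is defined analogously to the first line with the roles of the two domains swapped. The generators minimize the total while the discriminators maximize the adversarial terms. Only the SSIM term ties G(x_S) directly to x_S, and it does so at the pixel-statistics level.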

Load-bearing premise

The combination of losses will produce a mapping that retains semantic labels while aligning distributions without any target domain supervision.

What would settle it

If a model trained on the augmented data does not show improved accuracy on the target domain compared to training on the original data, the method's effectiveness would be called into question.

Figures

Figures reproduced from arXiv: 1907.03644 by Kirthi Shankar Sivamani.

Figure 1: Visualizations of the images from the source, ...
Figure 2: The black and white images are original MNIST ...
Figure 3: T-SNE visualization of the MNIST, MNISTM, ...

Original abstract

Dataset bias is a well-known problem in the field of computer vision. The presence of implicit bias in any image collection hinders a model trained and validated on a particular dataset to yield similar accuracies when tested on other datasets. In this paper, we propose a novel debiasing technique to reduce the effects of a biased training dataset. Our goal is to augment the training data using a generative network by learning a non-linear mapping from the source domain (training set) to the target domain (testing set) while retaining training set labels. The cycle consistency loss and adversarial loss for generative adversarial networks are used to learn the mapping. A structured similarity index (SSIM) loss is used to enforce label retention while augmenting the training set. Our methods and hypotheses are supported by quantitative comparisons with prior debiasing techniques. These comparisons showcase the superiority of our method and its potential to mitigate the effects of dataset bias during the inference stage.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 1 minor

Summary. The paper proposes an unsupervised debiasing technique that augments a source-domain training set by learning a non-linear mapping to the target domain via a generative network. The mapping is trained with standard CycleGAN losses (adversarial + cycle consistency) plus an additional SSIM term between source images and their generated counterparts, with the SSIM term intended to enforce retention of the original training labels. Quantitative comparisons against prior debiasing methods are presented to support superiority.

Significance. If the SSIM-augmented mapping can be shown to preserve semantic labels, the approach would supply a practical, label-free way to reduce low-level dataset biases at training time. The explicit addition of an SSIM regularizer to CycleGAN is a modest but concrete technical contribution that could be useful in settings where target labels are unavailable.

major comments (1)
  1. [Method (loss definition)] Method section (loss formulation): the central claim that the SSIM term 'enforce[s] label retention' rests on the untested assumption that low-level structural similarity between x_S and G(x_S) implies invariance of the semantic class label. SSIM penalizes changes in luminance, contrast and local structure but supplies no class-level supervision; without target labels or an auxiliary classifier audit of the generated images, nothing prevents G from mapping a source 'cat' image to a structurally similar but semantically different object that is more common in the target domain. This assumption is load-bearing for the entire augmentation pipeline and is not addressed by the cycle-consistency or adversarial terms alone.
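
The referee's description of what SSIM measures can be made concrete with a small example. The sketch below computes a single-window (global) SSIM over flat lists of pixel values in [0, 1]. This is a simplification: standard SSIM averages the same statistic over local windows, and the function names `global_ssim` and `ssim_loss` are hypothetical, not taken from the paper.

```python
def global_ssim(x, y, dynamic_range=1.0):
    """Single-window SSIM between two equal-length flat lists of pixel values."""
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(x)
    # Standard stabilizing constants with K1 = 0.01 and K2 = 0.03.
    c1 = (0.01 * dynamic_range) ** 2
    c2 = (0.03 * dynamic_range) ** 2
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov_xy = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    # Luminance comparison depends only on the means.
    luminance = (2 * mu_x * mu_y + c1) / (mu_x ** 2 + mu_y ** 2 + c1)
    # Contrast and structure comparison depends only on (co)variances.
    contrast_structure = (2 * cov_xy + c2) / (var_x + var_y + c2)
    return luminance * contrast_structure


def ssim_loss(x, y):
    """Loss form: zero for identical inputs, larger for dissimilar ones."""
    return 1.0 - global_ssim(x, y)


# Toy "images": a striped pattern, a dimmed copy, and an inverted copy.
source = [0.0, 1.0, 0.0, 1.0, 0.0, 1.0]
dimmed = [0.8 * p for p in source]
inverted = [1.0 - p for p in source]

print(ssim_loss(source, source))    # identical images
print(ssim_loss(source, dimmed))    # mild luminance and contrast change
print(ssim_loss(source, inverted))  # structure reversed
```

Every quantity in the function is a mean, variance, or covariance of pixel values; no class label enters the computation. That is the gap the comment identifies: the loss can be small for any output with similar pixel statistics, whether or not the semantic class is preserved.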
minor comments (1)
  1. [Abstract] Abstract: the statement that 'quantitative comparisons ... showcase the superiority of our method' should name the specific metrics (accuracy, mAP, etc.) and datasets used.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the detailed and constructive feedback. We address the major comment on the loss formulation and the underlying assumption regarding label retention below.

point-by-point responses
  1. Referee: [Method (loss definition)] Method section (loss formulation): the central claim that the SSIM term 'enforce[s] label retention' rests on the untested assumption that low-level structural similarity between x_S and G(x_S) implies invariance of the semantic class label. SSIM penalizes changes in luminance, contrast and local structure but supplies no class-level supervision; without target labels or an auxiliary classifier audit of the generated images, nothing prevents G from mapping a source 'cat' image to a structurally similar but semantically different object that is more common in the target domain. This assumption is load-bearing for the entire augmentation pipeline and is not addressed by the cycle-consistency or adversarial terms alone.

    Authors: We agree that the SSIM term relies on the assumption that preserving low-level structural similarity will help retain semantic labels when the primary domain differences are low-level biases (e.g., color, texture, or illumination shifts). The cycle-consistency and adversarial losses constrain the mapping but do not explicitly enforce semantic invariance, as noted. In the revised manuscript we will explicitly state this assumption in the method section, discuss its scope and potential limitations (including the possibility of semantic drift), and add a brief analysis of why the combination of losses is expected to be effective for low-level bias mitigation. We will also include an auxiliary experiment auditing generated images with a classifier trained on the source domain to provide empirical support where feasible.

    revision: partial
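
The classifier audit proposed in the rebuttal could look roughly like the following. This is a minimal sketch under stated assumptions: `label_retention_rate`, `toy_classifier`, and `toy_generator` are hypothetical stand-ins, with the classifier and generator passed in as plain callables rather than trained networks.

```python
def label_retention_rate(classify, generate, source_images, source_labels):
    """Fraction of source images whose generated counterpart keeps its label.

    classify: callable mapping an image to a predicted label (assumed to be
              a classifier trained on the source domain).
    generate: callable mapping a source image to its target-domain version.
    """
    if len(source_images) != len(source_labels) or not source_images:
        raise ValueError("need equally many images and labels, at least one")
    kept = sum(
        1
        for image, label in zip(source_images, source_labels)
        if classify(generate(image)) == label
    )
    return kept / len(source_images)


# Hypothetical toy stand-ins: label is 1 when the mean pixel exceeds 0.5.
def toy_classifier(image):
    return 1 if sum(image) / len(image) > 0.5 else 0


# Hypothetical "generator" that brightens every pixel, clipped to 1.0.
def toy_generator(image):
    return [min(1.0, p + 0.2) for p in image]


images = [[0.1, 0.2], [0.4, 0.4], [0.8, 0.9], [0.7, 0.6]]
labels = [toy_classifier(img) for img in images]

rate = label_retention_rate(toy_classifier, toy_generator, images, labels)
print(rate)  # 0.75: the second image crosses the decision threshold
```

A source-trained classifier may also misjudge correctly preserved images simply because they now look like the target domain, so the rate is a rough indicator of semantic drift rather than a precise measure.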

Circularity Check

0 steps flagged

No circularity; method is a direct application of standard losses without reduction to inputs

full rationale

The paper presents an unsupervised domain alignment approach that combines adversarial loss, cycle consistency loss, and an added SSIM term to map source to target while claiming label retention. No equations, fitted parameters, or self-citations are shown to reduce the central claim to a tautology or prior fitted quantity by construction. The derivation chain consists of standard CycleGAN components plus a new loss term, with superiority asserted via external quantitative comparisons rather than internal self-reference. This is self-contained against external benchmarks and receives the default non-circularity finding.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review; no explicit free parameters, axioms, or invented entities are stated. Standard GAN training assumptions are implicit but not detailed.


