Controlling biases and diversity in diverse image-to-image translation

Abel Gonzalez-Garcia; Joost Van De Weijer; Luis Herranz; Yaxing Wang

arxiv: 1907.09754 · v1 · pith:IRQRVVYJnew · submitted 2019-07-23 · 💻 cs.CV

Pith reviewed 2026-05-24 17:42 UTC · model grok-4.3

classification 💻 cs.CV

keywords image-to-image translation · diverse translation · bias · semantic constraints · unpaired learning · face translation · object translation

The pith

Semantic constraints reduce unwanted biases in diverse image-to-image translation.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper studies bias in diverse unpaired image-to-image translation, where models trained on skewed datasets introduce extra unwanted alterations such as shifts in gender or race when translating faces. It proposes semantic constraints that force the model to keep specific image properties fixed during the translation. These constraints are intended to cut the undesired changes while leaving the intended domain shift and the range of possible outputs intact. Tests on several heavily biased collections covering faces, objects and scenes indicate that the constraints achieve this separation of wanted and unwanted effects.

Core claim

By imposing semantic constraints that enforce the preservation of desired image properties, the model produces translations with fewer unwanted changes while still performing the wanted transformation, serving as a step towards unbiased diverse image-to-image translation.

What carries the argument

Semantic constraints that enforce preservation of desired image properties while sampling different style codes in a disentangled content-style latent space.
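A minimal sketch of how such a constraint could enter a training objective, assuming it takes the form of a mean squared distance between semantic features of the input and of its translation. The feature extractor `toy_features`, the weight `lam`, and both function names are hypothetical; the text above does not specify the exact form of the loss.

```python
from typing import Callable, Sequence

Vector = Sequence[float]


def semantic_constraint_loss(
    features: Callable[[Vector], Vector],
    source_image: Vector,
    translated_image: Vector,
) -> float:
    """Mean squared distance between semantic features of input and output.

    Assumption: the constraint penalises changes in the output of a fixed
    feature extractor; the paper's exact formulation may differ.
    """
    f_src = features(source_image)
    f_out = features(translated_image)
    return sum((a - b) ** 2 for a, b in zip(f_src, f_out)) / len(f_src)


def total_loss(translation_loss: float, constraint: float, lam: float = 1.0) -> float:
    """Hypothetical combined objective: base translation loss plus weighted constraint."""
    return translation_loss + lam * constraint


def toy_features(image: Vector) -> Vector:
    """Toy extractor (hypothetical): only the first two dimensions are 'semantic'."""
    return list(image[:2])


x = [1.0, 2.0, 3.0, 4.0]
y_preserving = [1.0, 2.0, 9.0, 9.0]  # changes only non-semantic dimensions
y_shifting = [3.0, 2.0, 3.0, 4.0]    # changes a semantic dimension

loss_preserving = semantic_constraint_loss(toy_features, x, y_preserving)
loss_shifting = semantic_constraint_loss(toy_features, x, y_shifting)
```

In this toy setup the translation that leaves the semantic dimensions alone incurs no penalty, while the one that alters them is penalised, which is the separation of wanted and unwanted changes the constraint is meant to encourage.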

Load-bearing premise

That the observed biases arise primarily from the visual distribution of the target domain and that semantic constraints can be imposed without degrading translation quality or diversity.

What would settle it

A quantitative test on the face datasets in which the constrained model still produces measurable gender or race shifts at rates comparable to the unconstrained baseline would falsify the claim that the constraints successfully limit unwanted changes.
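Such a test could be sketched as follows, assuming an attribute classifier (for gender, say) is run on each input and on its translation. The metric names follow the figure captions on this page (misclassification rate, drop in confidence); their exact definitions here are an assumption, and the numbers are made up for illustration, not results from the paper.

```python
from typing import Sequence


def misclassification_rate(true_labels: Sequence[int], predicted_after: Sequence[int]) -> float:
    """Fraction of translated images whose predicted attribute differs from the input's label."""
    flipped = sum(1 for t, p in zip(true_labels, predicted_after) if t != p)
    return flipped / len(true_labels)


def mean_confidence_drop(conf_before: Sequence[float], conf_after: Sequence[float]) -> float:
    """Average decrease in the classifier's probability for the input's true attribute."""
    drops = [b - a for b, a in zip(conf_before, conf_after)]
    return sum(drops) / len(drops)


# Illustrative labels and predictions only (hypothetical data).
labels = [0, 0, 1, 1]
baseline_pred = [1, 1, 1, 0]     # unconstrained model: three attribute flips
constrained_pred = [0, 0, 1, 0]  # constrained model: one attribute flip

baseline_rate = misclassification_rate(labels, baseline_pred)
constrained_rate = misclassification_rate(labels, constrained_pred)
```

If `constrained_rate` stayed close to `baseline_rate` on the real face datasets, that would count against the claim.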

Figures

Figures reproduced from arXiv: 1907.09754 by Abel Gonzalez-Garcia, Joost Van De Weijer, Luis Herranz, Yaxing Wang.

**Figure 1.** Diverse image-to-image translation in a very biased setting (domain A: mostly white males without makeup, domain B: white females with makeup): (a) biased translations, (b) with semantic constraint to alleviate bias while keeping relevant diversity.

**Figure 2.** Examples of training sets for image translation: (a) paired edge-photo, (b) unpaired young-old (well-aligned biases), and (c) unpaired without-with makeup (misaligned in gender).

**Figure 3.** Geometric interpretation of the semantic constraint unbiasing

**Figure 4.** Diverse image-to-image translation (DIT): (a) biased, (b) …

**Figure 5.** Robustness to bias on Biased makeup: (left) misclassification rate, (middle) drop in confidence, (right) ID distance.

Table extracted alongside the figure (the extracted text does not name the metric):

| Input | Direction | MUNIT | +PI | DRIT | UDIT | UDIT+PI |
|---|---|---|---|---|---|---|
| M | Makeup | 0.268 | 0.267 | 0.263 | 0.192 | 0.151 |
| F | Makeup | 0.212 | 0.199 | 0.193 | 0.154 | 0.133 |
| F | Demakeup | 0.297 | 0.293 | 0.253 | 0.208 | 0.203 |

**Figure 6.** Example translations for Biased makeup when applying makeup to a male. UDIT uses identity as semantic constraint.

**Figure 7.** Example translations on MORPH by biased DIT methods (MUNIT/DRIT) and our UDIT with semantic constraint on identity.

**Figure 8.** Robustness to bias on MORPH for (a) young to old and (b) old to young: (left) misclassification rate, (middle) drop in confidence, and (right) ID distance.

**Figure 9.** Robustness to bias in terms of misclassification rate and drop in confidence.

**Section 6.6, Cityscapes → Synthia-night (excerpt).** Semantic constraint. We train a binary classifier for daytime classification based on VGG16 [38] using both real and synthetic images. We use 6000 realistic images from BDD-100K [44] with a 50/50 daytime distribution. As synthetic images we use 6000 images from a disjoint subset of Synthia [36], also with a balanced class distribution. We consider two semantic constraints. …

**Figure 10.** Results on Cityscapes → Synthia-night. Example translations by MUNIT and UDIT with two variants of the semantic constraint.

**Figure 11.** Robustness to bias on Biased handbags.

**Figure 12.** Example translations for Handbags-texture (left) and Handbags-color (right). Better viewed electronically, zoom might be necessary to appreciate the changes in texture.

Original abstract

The task of unpaired image-to-image translation is highly challenging due to the lack of explicit cross-domain pairs of instances. We consider here diverse image translation (DIT), an even more challenging setting in which an image can have multiple plausible translations. This is normally achieved by explicitly disentangling content and style in the latent representation and sampling different style codes while maintaining the image content. Despite the success of current DIT models, they are prone to suffer from bias. In this paper, we study the problem of bias in image-to-image translation. Biased datasets may add undesired changes (e.g. change gender or race in face images) to the output translations as a consequence of the particular underlying visual distribution in the target domain. In order to alleviate the effects of this problem we propose the use of semantic constraints that enforce the preservation of desired image properties. Our proposed model is a step towards unbiased diverse image-to-image translation (UDIT), and results in fewer unwanted changes in the translated images while still performing the wanted transformation. Experiments on several heavily biased datasets show the effectiveness of the proposed techniques in different domains such as faces, objects, and scenes.
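The sampling scheme the abstract describes (fixed content, varying style codes) can be illustrated with a toy sketch. The encoder and decoder here are hypothetical stand-ins operating on short lists of floats, and the mean pairwise L1 distance is only a simple diversity proxy chosen for illustration; none of this reproduces the paper's networks or metrics.

```python
import random
from itertools import combinations
from typing import List


def encode_content(image: List[float]) -> List[float]:
    """Toy content encoder (hypothetical): the content is just a copy of the image."""
    return list(image)


def decode(content: List[float], style: List[float]) -> List[float]:
    """Toy decoder (hypothetical): adds the style code to the content element-wise."""
    return [c + s for c, s in zip(content, style)]


def translate_diverse(image: List[float], num_samples: int, rng: random.Random) -> List[List[float]]:
    """Keep the content fixed and decode it with several randomly sampled style codes."""
    content = encode_content(image)
    outputs = []
    for _ in range(num_samples):
        style = [rng.gauss(0.0, 1.0) for _ in content]
        outputs.append(decode(content, style))
    return outputs


def mean_pairwise_distance(outputs: List[List[float]]) -> float:
    """Diversity proxy: average L1 distance over all pairs of outputs."""
    pairs = list(combinations(outputs, 2))
    total = sum(sum(abs(a - b) for a, b in zip(u, v)) for u, v in pairs)
    return total / len(pairs)


rng = random.Random(0)
translations = translate_diverse([0.2, 0.5, 0.8], num_samples=4, rng=rng)
diversity = mean_pairwise_distance(translations)
```

A bias check would then ask whether attributes that should stay fixed also vary across `translations`, in addition to the intended style variation.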

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

Semantic constraints are added to diverse image-to-image translation to cut unwanted attribute shifts from biased data, but the abstract shows no numbers to judge the results.

The main point is that the authors add semantic constraints on top of content-style disentanglement so that diverse translation models stop flipping attributes like gender or race when the target domain distribution is skewed. They run tests on biased datasets covering faces, objects, and scenes and frame the result as a step toward unbiased diverse image-to-image translation. The direction makes sense because the bias problem they describe is a real one that shows up in practice with these models. The paper does a decent job of naming the issue and proposing a targeted addition that tries to keep the wanted translation and the output diversity.

The soft spot is the evidence. The abstract claims effectiveness but gives no quantitative results, no ablations, and no checks on whether diversity or translation quality drops. Without those details it is hard to know whether the constraints deliver or just add overhead. The assumption that biases come mainly from the target domain and can be fixed cleanly with external terms may not hold if the attributes are entangled with the translation task itself.

This paper is for people already working on image translation and fairness in generative models. A reader looking for a fully supported new method will find it light on data. It deserves a serious referee to examine the full experiments and see whether the claims stand up.

Referee Report

1 major / 1 minor

Summary. The paper addresses the problem of bias in diverse unpaired image-to-image translation (DIT), where target-domain statistics can induce unwanted attribute changes (e.g., gender or race in face images). It proposes adding semantic constraints to existing disentanglement-based DIT models to enforce preservation of desired properties, yielding a model for unbiased diverse image-to-image translation (UDIT) that reduces such changes while retaining the intended translation and output diversity. The abstract states that experiments on multiple heavily biased datasets across faces, objects, and scenes demonstrate the effectiveness of the approach.

Significance. If the semantic-constraint mechanism can be shown to measurably reduce unwanted attribute shifts without loss of translation fidelity or diversity, the work would provide a practical extension of disentanglement techniques that directly targets fairness issues in generative vision models. The absence of any reported metrics, however, prevents evaluation of whether the central claim holds.

major comments (1)

[Abstract] The claim that 'experiments on several heavily biased datasets show the effectiveness of the proposed techniques' is not backed by any quantitative results, ablation studies, error analysis, or even example metrics (e.g., attribute classification accuracy before/after, diversity scores such as LPIPS, or FID). Without such evidence the data-to-claim link cannot be assessed and the central assertion that unwanted changes are reduced while wanted transformations and diversity are preserved remains unevaluated.

minor comments (1)

[Abstract] The acronym 'UDIT' is introduced before its expansion ('unbiased diverse image-to-image translation') is given, which is a minor clarity issue.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the detailed review and constructive feedback. We address the major comment below and will revise the manuscript accordingly to strengthen the presentation of our results.

Referee: [Abstract] The claim that 'experiments on several heavily biased datasets show the effectiveness of the proposed techniques' is not backed by any quantitative results, ablation studies, error analysis, or even example metrics (e.g., attribute classification accuracy before/after, diversity scores such as LPIPS, or FID). Without such evidence the data-to-claim link cannot be assessed and the central assertion that unwanted changes are reduced while wanted transformations and diversity are preserved remains unevaluated.

Authors: We agree that the abstract would be strengthened by including key quantitative metrics. The full manuscript reports attribute classification accuracies (to quantify reduction in unwanted bias-induced changes), LPIPS scores (for diversity), and FID (for translation quality) in the experiments section across the evaluated datasets. We will revise the abstract to explicitly cite representative results (e.g., X% reduction in attribute shift with no loss in LPIPS or FID) while preserving the overall length and readability.

revision: yes

Circularity Check

0 steps flagged

No significant circularity detected

The paper proposes extending existing diverse I2I translation frameworks (e.g., via disentanglement of content/style) with an added semantic constraint term to reduce unwanted attribute shifts induced by target-domain statistics. The central claim rests on experimental validation across biased datasets rather than any closed-form derivation, fitted parameter renamed as prediction, or self-citation chain. No equations, uniqueness theorems, or ansatzes are described that reduce the method to its inputs by construction; the approach is presented as an empirical extension with external constraints, consistent with the reader's assessment of score 1.0.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review; no explicit free parameters, axioms, or invented entities are identifiable from the provided text. The approach implicitly relies on standard assumptions of disentangled latent representations in GAN-based translation models.

Reference graph

Works this paper leans on

53 extracted references · 53 canonical work pages · 2 internal anchors

[1] Almahairi, A., Rajeswar, S., Sordoni, A., Bachman, P., Courville, A., 2018. Augmented cyclegan: Learning many-to-many mappings from unpaired data, in: International Conference on Machine Learning.

[2] Badrinarayanan, V., Kendall, A., Cipolla, R., 2017. Segnet: A deep convolutional encoder-decoder architecture for image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence.

[3] Bengio, Y., Courville, A., Vincent, P., 2013. Representation learning: A review and new perspectives. IEEE Transactions on Pattern Analysis and Machine Intelligence.

[4] Borji, A., 2019. Pros and cons of gan evaluation measures. Computer Vision and Image Understanding 179, 41–65.

[5] Bousmalis, K., Silberman, N., Dohan, D., Erhan, D., Krishnan, D., 2017. Unsupervised pixel-level domain adaptation with generative adversarial networks, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.

[6] Bousmalis, K., Trigeorgis, G., Silberman, N., Krishnan, D., Erhan, D., 2016. Domain separation networks, in: Advances in Neural Information Processing Systems.

[7] Bozorgtabar, B., Rad, M.S., Ekenel, H.K., Thiran, J.P., 2019. Learn to synthesize and synthesize to learn. Computer Vision and Image Understanding.

[8] Buolamwini, J., Gebru, T., 2018. Gender shades: Intersectional accuracy disparities in commercial gender classification, in: Conference on Fairness, Accountability and Transparency, pp. 77–91.

[9] Chen, X., Duan, Y., Houthooft, R., Schulman, J., Sutskever, I., Abbeel, P., 2016. Infogan: Interpretable representation learning by information maximizing generative adversarial nets, in: Advances in Neural Information Processing Systems.

[10] Cordts, M., Omran, M., Ramos, S., Rehfeld, T., Enzweiler, M., Benenson, R., Franke, U., Roth, S., Schiele, B., 2016. The cityscapes dataset for semantic urban scene understanding, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.

[11] Daumé III, H., 2007. Frustratingly easy domain adaptation. Proceedings of the Annual Meeting of the Association of Computational Linguistics.

[12] Fang, C., Xu, Y., Rockmore, D.N., 2013. Unbiased metric learning: On the utilization of multiple datasets and web images for softening bias, in: Proceedings of the International Conference on Computer Vision, pp. 1657–1664.

[13] Ganin, Y., Lempitsky, V., 2015. Unsupervised domain adaptation by backpropagation, in: International Conference on Machine Learning.

[14] Gonzalez-Garcia, A., van de Weijer, J., Bengio, Y., 2018. Image-to-image translation for cross-domain disentanglement, in: Advances in Neural Information Processing Systems.

[15] Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., Bengio, Y., 2014. Generative adversarial nets, in: Advances in Neural Information Processing Systems.

[16] Hendricks, L.A., Burns, K., Saenko, K., Darrell, T., Rohrbach, A., 2018. Women also snowboard: Overcoming bias in captioning models, in: Proceedings of the European Conference on Computer Vision, Springer. pp. 793–811.

[17] Herranz, L., Jiang, S., Li, X., 2016. Scene recognition with cnns: objects, scales and dataset bias, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 571–579.

[18] Howard, A., Zhang, C., Horvitz, E., 2017. Addressing bias in machine learning algorithms: A pilot study on emotion recognition for intelligent systems, in: 2017 IEEE Workshop on Advanced Robotics and its Social Impacts (ARSO), IEEE. pp. 1–7.

[19] Huang, X., Liu, M.Y., Belongie, S., Kautz, J., 2018. Multimodal unsupervised image-to-image translation. Proceedings of the European Conference on Computer Vision.

[20] Isola, P., Zhu, J.Y., Zhou, T., Efros, A.A., 2017. Image-to-image translation with conditional adversarial networks, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.

[21] Jiang, H., Nachum, O., 2019. Identifying and correcting label bias in machine learning. arXiv preprint arXiv:1901.04966.

[22] Khosla, A., Zhou, T., Malisiewicz, T., Efros, A.A., Torralba, A., 2012. Undoing the damage of dataset bias, in: Proceedings of the European Conference on Computer Vision, Springer. pp. 158–171.

[23] Kim, T., Cha, M., Kim, H., Lee, J.K., Kim, J., 2017. Learning to discover cross-domain relations with generative adversarial networks. International Conference on Machine Learning.

[24] Lee, H.Y., Tseng, H.Y., Huang, J.B., Singh, M., Yang, M.H., 2018. Diverse image-to-image translation via disentangled representations. Proceedings of the European Conference on Computer Vision.

[25] Lekic, V., Babic, Z., 2019. Automotive radar and camera fusion using generative adversarial networks. Computer Vision and Image Understanding. doi: 10.1016/j.cviu.2019.04.002.

[26] Levi, G., Hassner, T., 2015. Age and gender classification using convolutional neural networks, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pp. 34–42.

[27] Liu, M.Y., Breuel, T., Kautz, J., 2017. Unsupervised image-to-image translation networks, in: Advances in Neural Information Processing Systems, pp. 700–708.

[28] Liu, X., Van De Weijer, J., Bagdanov, A.D., 2019. Exploiting unlabeled data in cnns by self-supervised learning to rank. IEEE Transactions on Pattern Analysis and Machine Intelligence 41, 1862–1878.

[29] Liu, Y.C., Yeh, Y.Y., Fu, T.C., Wang, S.D., Chiu, W.C., Wang, Y.C.F., 2018. Detach and adapt: Learning cross-domain disentangled deep representation, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.

[30] Mathieu, M.F., Zhao, J.J., Zhao, J., Ramesh, A., Sprechmann, P., LeCun, Y., 2016. Disentangling factors of variation in deep representation using adversarial training, in: Advances in Neural Information Processing Systems.

[31] Parkhi, O.M., Vedaldi, A., Zisserman, A., 2015. Deep face recognition, in: Proceedings of the British Machine Vision Conference.

[32] Patel, V.M., Gopalan, R., Li, R., Chellappa, R., 2015. Visual domain adaptation: A survey of recent advances. IEEE Signal Processing Magazine 32, 53–69.

[33] Reed, S., Sohn, K., Zhang, Y., Lee, H., 2014. Learning to disentangle factors of variation with manifold interaction, in: International Conference on Machine Learning.

[34] Reed, S.E., Zhang, Y., Zhang, Y., Lee, H., 2015. Deep visual analogy-making, in: Advances in Neural Information Processing Systems.

[35] Ricanek, K., Tesafaye, T., 2006. Morph: A longitudinal image database of normal adult age-progression, in: Automatic Face and Gesture Recognition, 2006. FGR 2006. 7th International Conference on, IEEE. pp. 341–345.

[36] Ros, G., Sellart, L., Materzynska, J., Vazquez, D., Lopez, A.M., 2016. The synthia dataset: A large collection of synthetic images for semantic segmentation of urban scenes, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3234–3243.

[37] Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., Huang, Z., Karpathy, A., Khosla, A., Bernstein, M., et al., 2015. Imagenet large scale visual recognition challenge. International Journal of Computer Vision 115, 211–252.

[38] Simonyan, K., Zisserman, A., 2014. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556.

[39] Taigman, Y., Polyak, A., Wolf, L., 2017. Unsupervised cross-domain image generation, in: International Conference on Learning Representations.

[40] Taigman, Y., Yang, M., Ranzato, M., Wolf, L., 2014. Deepface: Closing the gap to human-level performance in face verification, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1701–1708.

[41] Torralba, A., Efros, A.A., 2011. Unbiased look at dataset bias, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, IEEE. pp. 1521–1528.

[42] Wang, Y., van de Weijer, J., Herranz, L., 2018. Mix and match networks: encoder-decoder alignment for zero-pair image translation, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.

[43] Yi, Z., Zhang, H.R., Tan, P., Gong, M., 2017. Dualgan: Unsupervised dual learning for image-to-image translation, in: Proceedings of the International Conference on Computer Vision, pp. 2868–2876.

[44] Yu, F., Xian, W., Chen, Y., Liu, F., Liao, M., Madhavan, V., Darrell, T., 2018a. Bdd100k: A diverse driving video database with scalable annotation tooling. Proceedings of the European Conference on Computer Vision.

[45] Yu, L., Cheng, Y., van de Weijer, J., 2018b. Weakly supervised domain-specific color naming based on attention, in: Proceedings of the International Conference on Pattern Recognition, IEEE. pp. 3019–3024.

[46] Zhang, L., Gonzalez-Garcia, A., van de Weijer, J., Danelljan, M., Khan, F.S., 2018a. Synthetic data generation for end-to-end thermal infrared tracking. IEEE Transactions on Image Processing 28, 1837–1850.

[47] Zhang, R., Isola, P., Efros, A.A., Shechtman, E., Wang, O., 2018b. The unreasonable effectiveness of deep networks as a perceptual metric, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.

[48] Zhao, S., Ren, H., Yuan, A., Song, J., Goodman, N., Ermon, S., 2018. Bias and generalization in deep generative models: An empirical study, in: Advances in Neural Information Processing Systems, pp. 10815–10824.

[49] Zhu, J.Y., Park, T., Isola, P., Efros, A.A., 2017a. Unpaired image-to-image translation using cycle-consistent adversarial networks, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.

[50] Zhu, J.Y., Zhang, R., Pathak, D., Darrell, T., Efros, A.A., Wang, O., Shechtman, E., 2017b. Toward multimodal image-to-image translation, in: Advances in Neural Information Processing Systems, pp. 465–476.

[51] Zou, J., Schiebinger, L., 2018. AI can be sexist and racist - it's time to make it fair.

Fig. 11: Robustness to bias on Biased handbags.

Fig. 12: Example translations for Handbags-texture (left) and Handbags-color (right). Better viewed electronically; zooming may be necessary to appreciate the changes in texture.

[5] [5]

Unsupervised pixel-level domain adaptation with gen- erative adversarial networks, in: Proceedings of the IEEE Con- ference on Computer Vision and Pattern Recognition

Bousmalis, K., Silberman, N., Dohan, D., Erhan, D., Krishnan, D., 2017. Unsupervised pixel-level domain adaptation with gen- erative adversarial networks, in: Proceedings of the IEEE Con- ference on Computer Vision and Pattern Recognition

work page 2017

[6] [6]

Domain separation networks, in: Advances in Neural Information Processing Systems

Bousmalis, K., Trigeorgis, G., Silberman, N., Krishnan, D., Er- han, D., 2016. Domain separation networks, in: Advances in Neural Information Processing Systems

work page 2016

[7] [7]

Learn to synthesize and synthesize to learn

Bozorgtabar, B., Rad, M.S., Ekenel, H.K., Thiran, J.P., 2019. Learn to synthesize and synthesize to learn. Computer Vision and Image Understanding

work page 2019

[8] [8]

Gender shades: Intersec- tional accuracy disparities in commercial gender classiﬁcation, in: Conference on Fairness, Accountability and Transparency, pp

Buolamwini, J., Gebru, T., 2018. Gender shades: Intersec- tional accuracy disparities in commercial gender classiﬁcation, in: Conference on Fairness, Accountability and Transparency, pp. 77–91

work page 2018

[9] [9]

Infogan: Interpretable representation learn- ing by information maximizing generative adversarial nets, in: Advances in Neural Information Processing Systems

Chen, X., Duan, Y., Houthooft, R., Schulman, J., Sutskever, I., Abbeel, P., 2016. Infogan: Interpretable representation learn- ing by information maximizing generative adversarial nets, in: Advances in Neural Information Processing Systems

work page 2016

[10] [10]

The cityscapes dataset for semantic urban scene understanding, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition

Cordts, M., Omran, M., Ramos, S., Rehfeld, T., Enzweiler, M., Benenson, R., Franke, U., Roth, S., Schiele, B., 2016. The cityscapes dataset for semantic urban scene understanding, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition

work page 2016

[11] [11]

Frustratingly easy domain adaptation

Daum´ e III, H., 2007. Frustratingly easy domain adaptation. Proceedings of the Annual Meeting of the Association of Com- putational Linguistics

work page 2007

[12] [12]

Unbiased metric learn- ing: On the utilization of multiple datasets and web images for softening bias, in: Proceedings of the International Conference on Computer Vision, pp

Fang, C., Xu, Y., Rockmore, D.N., 2013. Unbiased metric learn- ing: On the utilization of multiple datasets and web images for softening bias, in: Proceedings of the International Conference on Computer Vision, pp. 1657–1664

work page 2013

[13] [13]

Unsupervised domain adapta- tion by backpropagation, in: International Conference on Ma- chine Learning

Ganin, Y., Lempitsky, V., 2015. Unsupervised domain adapta- tion by backpropagation, in: International Conference on Ma- chine Learning

work page 2015

[14] [14]

Image- to-image translation for cross-domain disentanglement, in: Ad- vances in Neural Information Processing Systems

Gonzalez-Garcia, A., van de Weijer, J., Bengio, Y., 2018. Image- to-image translation for cross-domain disentanglement, in: Ad- vances in Neural Information Processing Systems

work page 2018

[15] [15]

Generative adversarial nets, in: Advances in Neural Information Processing Systems

Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde- Farley, D., Ozair, S., Courville, A., Bengio, Y., 2014. Generative adversarial nets, in: Advances in Neural Information Processing Systems

work page 2014

[16] [16]

Women also snowboard: Overcoming bias in cap- tioning models, in: Proceedings of the European Conference on Computer Vision, Springer

Hendricks, L.A., Burns, K., Saenko, K., Darrell, T., Rohrbach, A., 2018. Women also snowboard: Overcoming bias in cap- tioning models, in: Proceedings of the European Conference on Computer Vision, Springer. pp. 793–811

work page 2018

[17] [17]

Scene recognition with cnns: objects, scales and dataset bias, in: Proceedings of the IEEE 12 Conference on Computer Vision and Pattern Recognition, pp

Herranz, L., Jiang, S., Li, X., 2016. Scene recognition with cnns: objects, scales and dataset bias, in: Proceedings of the IEEE 12 Conference on Computer Vision and Pattern Recognition, pp. 571–579

work page 2016

[18] [18]

Howard, A., Zhang, C., Horvitz, E., 2017. Addressing bias in machine learning algorithms: A pilot study on emotion recog- nition for intelligent systems, in: 2017 IEEE Workshop on Ad- vanced Robotics and its Social Impacts (ARSO), IEEE. pp. 1–7

work page 2017

[19] [19]

Multimodal unsupervised image-to-image translation

Huang, X., Liu, M.Y., Belongie, S., Kautz, J., 2018. Multimodal unsupervised image-to-image translation. Proceedings of the European Conference on Computer Vision

work page 2018

[20] [20]

Image-to-image translation with conditional adversarial networks, in: Proceed- ings of the IEEE Conference on Computer Vision and Pattern Recognition

Isola, P., Zhu, J.Y., Zhou, T., Efros, A.A., 2017. Image-to-image translation with conditional adversarial networks, in: Proceed- ings of the IEEE Conference on Computer Vision and Pattern Recognition

work page 2017

[21] [21]

Identifying and Correcting Label Bias in Machine Learning

Jiang, H., Nachum, O., 2019. Identifying and correcting label bias in machine learning. arXiv preprint arXiv:1901.04966

work page internal anchor Pith review Pith/arXiv arXiv 2019

[22] [22]

Undoing the damage of dataset bias, in: Proceedings of the European Conference on Computer Vision, Springer

Khosla, A., Zhou, T., Malisiewicz, T., Efros, A.A., Torralba, A., 2012. Undoing the damage of dataset bias, in: Proceedings of the European Conference on Computer Vision, Springer. pp. 158–171

work page 2012

[23] [23]

Learning to discover cross-domain relations with generative adversarial networks

Kim, T., Cha, M., Kim, H., Lee, J.K., Kim, J., 2017. Learning to discover cross-domain relations with generative adversarial networks. International Conference on Machine Learning

work page 2017

[24] [24]

Lee, H.Y., Tseng, H.Y., Huang, J.B., Singh, M., Yang, M.H.,

work page

[25] [25]

Proceedings of the European Conference on Com- puter Vision

Diverse image-to-image translation via disentangled rep- resentations. Proceedings of the European Conference on Com- puter Vision

work page

[26] [26]

Automotive radar and camera fusion using generative adversarial networks

Lekic, V., Babic, Z., 2019. Automotive radar and camera fusion using generative adversarial networks. Computer Vision and Image Understanding doi: 10.1016/j.cviu.2019.04.002

work page doi:10.1016/j.cviu.2019.04.002 2019

[27] [27]

Age and gender classiﬁcation us- ing convolutional neural networks, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Work- shops, pp

Levi, G., Hassner, T., 2015. Age and gender classiﬁcation us- ing convolutional neural networks, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Work- shops, pp. 34–42

work page 2015

[28] [28]

Unsupervised image-to- image translation networks, in: Advances in Neural Information Processing Systems, pp

Liu, M.Y., Breuel, T., Kautz, J., 2017. Unsupervised image-to- image translation networks, in: Advances in Neural Information Processing Systems, pp. 700–708

work page 2017

[29] [29]

Exploiting unlabeled data in cnns by self-supervised learning to rank

Liu, X., Van De Weijer, J., Bagdanov, A.D., 2019. Exploiting unlabeled data in cnns by self-supervised learning to rank. IEEE Transactions on Pattern Analysis and Machine Intelligence 41, 1862–1878

work page 2019

[30] [30]

Detach and adapt: Learning cross-domain disen- tangled deep representation, in: Proceedings of the IEEE Con- ference on Computer Vision and Pattern Recognition

Liu, Y.C., Yeh, Y.Y., Fu, T.C., Wang, S.D., Chiu, W.C., Wang, Y.C.F., 2018. Detach and adapt: Learning cross-domain disen- tangled deep representation, in: Proceedings of the IEEE Con- ference on Computer Vision and Pattern Recognition

work page 2018

[31] [31]

Disentangling factors of variation in deep representation using adversarial training, in: Advances in Neu- ral Information Processing Systems

Mathieu, M.F., Zhao, J.J., Zhao, J., Ramesh, A., Sprechmann, P., LeCun, Y., 2016. Disentangling factors of variation in deep representation using adversarial training, in: Advances in Neu- ral Information Processing Systems

work page 2016

[32] [32]

Deep face recognition, in: Proceedings of the British Machine Vision Con- ference

Parkhi, O.M., Vedaldi, A., Zisserman, A., 2015. Deep face recognition, in: Proceedings of the British Machine Vision Con- ference

work page 2015

[33] [33]

Visual domain adaptation: A survey of recent advances

Patel, V.M., Gopalan, R., Li, R., Chellappa, R., 2015. Visual domain adaptation: A survey of recent advances. IEEE signal processing magazine 32, 53–69

work page 2015

[34] [34]

Learning to disentangle factors of variation with manifold interaction, in: International Conference on Machine Learning

Reed, S., Sohn, K., Zhang, Y., Lee, H., 2014. Learning to disentangle factors of variation with manifold interaction, in: International Conference on Machine Learning

work page 2014

[35] [35]

Deep visual analogy-making, in: Advances in Neural Information Processing Systems

Reed, S.E., Zhang, Y., Zhang, Y., Lee, H., 2015. Deep visual analogy-making, in: Advances in Neural Information Processing Systems

work page 2015

[36] [36]

Morph: A longitudinal image database of normal adult age-progression, in: Automatic Face and Gesture Recognition, 2006

Ricanek, K., Tesafaye, T., 2006. Morph: A longitudinal image database of normal adult age-progression, in: Automatic Face and Gesture Recognition, 2006. FGR 2006. 7th International Conference on, IEEE. pp. 341–345

work page 2006

[37] [37]

Ros, G., Sellart, L., Materzynska, J., Vazquez, D., Lopez, A.M.,

work page

[38] [38]

3234–3243

The synthia dataset: A large collection of synthetic im- ages for semantic segmentation of urban scenes, in: Proceed- ings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3234–3243

work page

[39] [39]

Imagenet large scale visual recognition challenge

Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., Huang, Z., Karpathy, A., Khosla, A., Bernstein, M., et al., 2015. Imagenet large scale visual recognition challenge. International Journal of Computer Vision 115, 211–252

work page 2015

[40] [40]

Very Deep Convolutional Networks for Large-Scale Image Recognition

Simonyan, K., Zisserman, A., 2014. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556

work page internal anchor Pith review Pith/arXiv arXiv 2014

[41] [41]

Unsupervised cross- domain image generation, in: International Conference on Learning Representations

Taigman, Y., Polyak, A., Wolf, L., 2017. Unsupervised cross- domain image generation, in: International Conference on Learning Representations

work page 2017

[42] [42]

Deepface: Closing the gap to human-level performance in face veriﬁcation, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp

Taigman, Y., Yang, M., Ranzato, M., Wolf, L., 2014. Deepface: Closing the gap to human-level performance in face veriﬁcation, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1701–1708

work page 2014

[43] [43]

Unbiased look at dataset bias, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, IEEE

Torralba, A., Efros, A.A., 2011. Unbiased look at dataset bias, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, IEEE. pp. 1521–1528

work page 2011

[44] [44]

Mix and match networks: encoder-decoder alignment for zero-pair image trans- lation, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition

Wang, Y., van de Weijer, J., Herranz, L., 2018. Mix and match networks: encoder-decoder alignment for zero-pair image trans- lation, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition

work page 2018

[45] [45]

Dualgan: Unsu- pervised dual learning for image-to-image translation., in: Pro- ceedings of the International Conference on Computer Vision, pp

Yi, Z., Zhang, H.R., Tan, P., Gong, M., 2017. Dualgan: Unsu- pervised dual learning for image-to-image translation., in: Pro- ceedings of the International Conference on Computer Vision, pp. 2868–2876

work page 2017

[46] [46]

Bdd100k: A diverse driving video database with scalable annotation tooling

Yu, F., Xian, W., Chen, Y., Liu, F., Liao, M., Madhavan, V., Darrell, T., 2018a. Bdd100k: A diverse driving video database with scalable annotation tooling. Proceedings of the European Conference on Computer Vision

work page

[47] [47]

Weakly supervised domain-speciﬁc color naming based on attention, in: Proceed- ings of the International Conference on Pattern Recognition, IEEE

Yu, L., Cheng, Y., van de Weijer, J., 2018b. Weakly supervised domain-speciﬁc color naming based on attention, in: Proceed- ings of the International Conference on Pattern Recognition, IEEE. pp. 3019–3024

work page

[48] [48]

Synthetic data generation for end-to- end thermal infrared tracking

Zhang, L., Gonzalez-Garcia, A., van de Weijer, J., Danelljan, M., Khan, F.S., 2018a. Synthetic data generation for end-to- end thermal infrared tracking. IEEE Transactions on Image Processing 28, 1837–1850

work page

[49] [49]

The unreasonable eﬀectiveness of deep networks as a perceptual metric, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition

Zhang, R., Isola, P., Efros, A.A., Shechtman, E., Wang, O., 2018b. The unreasonable eﬀectiveness of deep networks as a perceptual metric, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition

work page

[50] [50]

Bias and generalization in deep generative models: An empirical study, in: Advances in Neural Information Processing Systems, pp

Zhao, S., Ren, H., Yuan, A., Song, J., Goodman, N., Ermon, S., 2018. Bias and generalization in deep generative models: An empirical study, in: Advances in Neural Information Processing Systems, pp. 10815–10824

work page 2018

[51] [51]

Unpaired image-to-image translation using cycle-consistent adversarial networks, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition

Zhu, J.Y., Park, T., Isola, P., Efros, A.A., 2017a. Unpaired image-to-image translation using cycle-consistent adversarial networks, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition

work page

[52] [52]

Toward multimodal image- to-image translation, in: Advances in Neural Information Pro- cessing Systems, pp

Zhu, J.Y., Zhang, R., Pathak, D., Darrell, T., Efros, A.A., Wang, O., Shechtman, E., 2017b. Toward multimodal image- to-image translation, in: Advances in Neural Information Pro- cessing Systems, pp. 465–476

work page

[53] [53]

Ai can be sexist and racistits time to make it fair

Appendix

Tables 6-10 show the architectures of the content encoder, style encoder, image decoder and discriminator used in the cross-modal experiment. The abbreviations used are listed in Table 11.

Layer    Input → Output    Kernel, stride, pad
conv1    [4,128, 128,3]→[4,128, 12...