In this paper, we show that popular Generative Adversarial Networks (GANs) exacerbate biases along the axes of gender and skin tone when given a skewed distribution of face shots. While practitioners celebrate synthetic data generation with GANs as an economical way to augment data for training data-hungry machine learning models, it is unclear whether they recognize the perils of such techniques when applied to real-world datasets biased along latent dimensions. Specifically, we show that (1) traditional GANs further skew the distribution of a dataset consisting of engineering faculty headshots, generating minority modes less often and at lower quality, and (2) image-to-image translation (conditional) GANs also exacerbate biases by lightening the skin color of non-white faces and transforming female facial features to be masculine when generating faces of engineering professors. Thus, our study is meant to serve as a cautionary tale.