Despite the recent growth in computer vision applications, the question of how fair and unbiased they are has yet to be explored. There is abundant evidence that bias present in training data is reflected, or even amplified, in the models trained on it. Many previous methods for de-biasing image datasets, including those based on dataset augmentation, are computationally expensive to implement. In this study, we present a fast and effective model that de-biases an image dataset through reconstruction while minimizing the statistical dependence between the target attribute and the protected attribute. Our architecture combines a U-Net, which reconstructs the images, with a pre-trained classifier, which penalizes this dependence. We evaluate the proposed model on the CelebA dataset, compare the results with a state-of-the-art de-biasing method, and show that the model achieves a promising fairness-accuracy combination.
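As a rough illustration of the kind of objective described above, the following sketch combines a reconstruction error with a penalty on the statistical dependence between a classifier's target-attribute predictions and the protected attribute. The abstract does not state which dependence measure is used, so the Hilbert-Schmidt Independence Criterion (HSIC) with Gaussian kernels is an assumption made here for illustration. The function names (`gaussian_kernel`, `hsic`, `debias_objective`), the weight `lam`, and the kernel width `sigma` are likewise hypothetical, and plain NumPy arrays stand in for the U-Net output and the pre-trained classifier's predictions.

```python
import numpy as np


def gaussian_kernel(x, sigma=1.0):
    """Gaussian (RBF) kernel matrix for a batch of row vectors x, shape (n, d)."""
    sq_norms = np.sum(x ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (x @ x.T)
    sq_dists = np.maximum(sq_dists, 0.0)  # guard against tiny negative values
    return np.exp(-sq_dists / (2.0 * sigma ** 2))


def hsic(x, y, sigma=1.0):
    """Biased empirical HSIC estimate: trace(K H L H) / (n - 1)^2."""
    n = x.shape[0]
    h = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    k = gaussian_kernel(x, sigma)
    l = gaussian_kernel(y, sigma)
    return float(np.trace(k @ h @ l @ h)) / (n - 1) ** 2


def debias_objective(original, reconstructed, target_pred, protected, lam=1.0):
    """Reconstruction error plus a weighted dependence penalty.

    This form of the objective is an assumption for illustration only:
    mean squared reconstruction error + lam * HSIC(target_pred, protected).
    """
    recon = float(np.mean((original - reconstructed) ** 2))
    dep = hsic(target_pred, protected)
    return recon + lam * dep, recon, dep


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 64
    protected = rng.integers(0, 2, size=(n, 1)).astype(float)
    images = rng.random((n, 8, 8))                        # stand-in for input images
    recon_images = images + 0.01 * rng.standard_normal(images.shape)
    biased_pred = protected + 0.1 * rng.standard_normal((n, 1))  # tracks the protected attribute
    total, recon, dep = debias_objective(images, recon_images, biased_pred, protected)
    print(f"total={total:.4f} reconstruction={recon:.6f} dependence={dep:.4f}")
```

In this sketch, predictions that track the protected attribute produce a larger penalty than predictions drawn independently of it, so minimizing the combined objective trades reconstruction fidelity against dependence. In practice the trade-off would be controlled by a weight such as `lam`.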