Assessing Dataset Bias in Computer Vision

A biased dataset is generally one whose attributes have an uneven class distribution. These biases tend to propagate to the models trained on such data, often leading to poor performance on the minority class. In this project, we explore the extent to which various data augmentation methods alleviate the intrinsic biases within a dataset. We apply several augmentation techniques to a sample of the UTKFace dataset, including undersampling, geometric transformations, variational autoencoders (VAEs), and generative adversarial networks (GANs). We then train a classifier on each of the augmented datasets and evaluate its performance on the native test set and on external facial recognition datasets. We also compare their performance to the state-of-the-art attribute classifier trained on the FairFace dataset. Through experimentation, we find that training the model on StarGAN-generated images leads to the best overall performance. We also find that training on geometrically transformed images leads to similar performance with a much shorter training time. Additionally, the best-performing models exhibit uniform performance across the classes within each attribute, which indicates that they mitigate the biases present in the baseline model trained on the original training set. Finally, we show that our model has better overall performance and consistency than the FairFace model on age and ethnicity classification across multiple datasets. Our final model achieves an accuracy on the UTKFace test set of 91.75%, 91.30%, and 87.20% for the gender, age, and ethnicity attributes respectively, with a standard deviation of less than 0.1 between the per-class accuracies of each attribute.

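Two ideas in the abstract can be illustrated with a short sketch: undersampling a dataset until every class is equally frequent, and measuring how uniform a classifier is by taking the standard deviation of its per-class accuracies. The code below is a minimal illustration and not the authors' implementation. The function names (`undersample`, `per_class_accuracy`, `accuracy_spread`) are hypothetical, and the use of the population standard deviation over accuracies expressed as fractions is an assumption, since the abstract does not state which convention was used.

```python
import random
import statistics
from collections import Counter, defaultdict


def undersample(samples, label_of, seed=0):
    """Randomly drop samples so each class keeps as many items as the rarest class.

    `label_of` maps a sample to its class label (hypothetical helper argument).
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for sample in samples:
        by_class[label_of(sample)].append(sample)
    # Size of the rarest class becomes the per-class quota.
    target = min(len(items) for items in by_class.values())
    balanced = []
    for label in sorted(by_class):
        balanced.extend(rng.sample(by_class[label], target))
    return balanced


def per_class_accuracy(y_true, y_pred):
    """Return {class: fraction of that class's samples predicted correctly}."""
    correct = Counter()
    total = Counter()
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    return {label: correct[label] / total[label] for label in total}


def accuracy_spread(y_true, y_pred):
    """Population standard deviation of the per-class accuracies (assumed convention)."""
    return statistics.pstdev(list(per_class_accuracy(y_true, y_pred).values()))
```

A low `accuracy_spread` means the classifier performs about equally well on every class of an attribute, which is the sense of "uniform performance" used in the abstract. A high overall accuracy combined with a large spread would instead suggest that the majority classes dominate the score.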