There are demographic biases in current models used for facial recognition (FR). Our Balanced Faces in the Wild (BFW) dataset serves as a proxy to measure bias across ethnicity and gender subgroups, allowing one to characterize FR performance per subgroup. We show that results are non-optimal when a single score threshold determines whether sample pairs are genuine or impostors. Within subgroups, performance often varies significantly from the global average; thus, claims of specific error rates hold only for populations that match the validation data. We mitigate the imbalanced performance using a novel domain adaptation learning scheme applied to facial features extracted with state-of-the-art neural networks. This technique not only balances performance across subgroups but also boosts overall performance. A further benefit of the proposed scheme is that it preserves identity information in the facial features while reducing the demographic information they contain. Removing demographic knowledge prevents potential future biases from being injected into decision-making, and it improves privacy since less information about an individual is available or can be inferred. We explore this qualitatively, and we also show quantitatively that subgroup classifiers can no longer learn from the features produced by the proposed domain adaptation scheme. For source code and data descriptions, see https://github.com/visionjo/facerec-bias-bfw.
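
To make the single-threshold issue concrete, the following minimal sketch uses synthetic, hypothetical impostor scores and subgroup names (not BFW data) to show how one global threshold chosen on pooled scores can yield very different false match rates per subgroup.

    # Hypothetical illustration: a single global threshold gives uneven
    # error rates when impostor score distributions differ by subgroup.
    import numpy as np

    rng = np.random.default_rng(0)

    def impostor_scores(mean, n=10_000):
        """Synthetic cosine similarities for impostor pairs of one subgroup."""
        return np.clip(rng.normal(mean, 0.08, n), -1.0, 1.0)

    # Assumed (made-up) subgroup-dependent impostor distributions:
    # a higher mean means impostors look more alike to the model.
    subgroups = {
        "asian_female": impostor_scores(0.32),
        "asian_male":   impostor_scores(0.28),
        "white_female": impostor_scores(0.22),
        "white_male":   impostor_scores(0.20),
    }

    # One global threshold targeting roughly 0.1% FMR on the pooled scores.
    pooled = np.concatenate(list(subgroups.values()))
    global_thr = np.quantile(pooled, 1 - 1e-3)

    for name, scores in subgroups.items():
        fmr = np.mean(scores > global_thr)  # false match rate at the global threshold
        print(f"{name:14s} FMR @ global threshold: {fmr:.4%}")

Running this prints a pooled-calibrated error rate that is several times higher for some subgroups than for others, which is the imbalance the per-subgroup analysis above measures.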
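The abstract does not specify the domain adaptation scheme itself; as a hedged illustration only, the sketch below shows one common way to reduce demographic information in face embeddings while retaining identity: an adversarial subgroup classifier trained through a gradient-reversal layer (PyTorch). All names here (DebiasHead, feats, id_labels, sub_labels) are hypothetical placeholders, and the paper's actual method may differ.

    # A minimal sketch (not the paper's exact scheme) of adversarial removal
    # of demographic information from pre-extracted face embeddings.
    import torch
    import torch.nn as nn

    class GradReverse(torch.autograd.Function):
        @staticmethod
        def forward(ctx, x, lambd):
            ctx.lambd = lambd
            return x.view_as(x)

        @staticmethod
        def backward(ctx, grad_output):
            # Flip the gradient so the feature mapper unlearns subgroup cues.
            return -ctx.lambd * grad_output, None

    class DebiasHead(nn.Module):
        def __init__(self, dim=512, n_ids=1000, n_subgroups=8, lambd=1.0):
            super().__init__()
            self.mapper = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))
            self.id_clf = nn.Linear(dim, n_ids)         # keeps identity information
            self.sub_clf = nn.Linear(dim, n_subgroups)  # adversary: predicts subgroup
            self.lambd = lambd

        def forward(self, feats):
            z = self.mapper(feats)
            id_logits = self.id_clf(z)
            sub_logits = self.sub_clf(GradReverse.apply(z, self.lambd))
            return z, id_logits, sub_logits

    # Usage sketch: embeddings would come from a frozen, pre-trained FR network.
    model = DebiasHead()
    feats = torch.randn(32, 512)               # placeholder face embeddings
    id_labels = torch.randint(0, 1000, (32,))  # placeholder identity labels
    sub_labels = torch.randint(0, 8, (32,))    # placeholder subgroup labels
    _, id_logits, sub_logits = model(feats)
    loss = nn.functional.cross_entropy(id_logits, id_labels) \
         + nn.functional.cross_entropy(sub_logits, sub_labels)
    loss.backward()

After training in this style, a subgroup classifier fit on the mapped features should perform near chance, which is the kind of quantitative check the abstract describes.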