Demographic biases exist in the state-of-the-art (SOTA) convolutional neural networks (CNNs) used for face recognition (FR). Our Balanced Faces in the Wild (BFW) dataset serves as a proxy to measure bias across ethnicity and gender subgroups, allowing us to characterize FR performance per subgroup. We show that performance is non-optimal when a single score threshold is used to decide whether sample pairs are genuine or imposter. Furthermore, actual performance varies greatly across subgroups and departs from the reported ratings. Thus, claims of specific error rates hold true only for populations matching that of the validation data. We mitigate the imbalanced performance using a novel domain adaptation learning scheme applied to the facial encodings extracted by SOTA deep networks. Not only does this technique balance performance across subgroups, it also boosts overall performance. A benefit of the proposed method is that it preserves identity information in the facial features while removing demographic knowledge from the lower-dimensional features. Removing demographic knowledge prevents potential future biases from being injected into decision-making, and it also addresses privacy concerns. We explore qualitatively why this works using hard samples. We also show quantitatively that subgroup classifiers can no longer learn from the encodings mapped by the proposed method.
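The single-threshold claim can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the similarity scores are synthetic Gaussian draws, the two subgroups "A" and "B" are hypothetical, and the helpers `threshold_at_far` and `far` are not taken from the paper. The sketch only shows why one global threshold, chosen for a target false-accept rate (FAR) on pooled data, can yield very different FARs per subgroup, whereas subgroup-specific thresholds bring each subgroup back to the target.

```python
import numpy as np

def threshold_at_far(imposter_scores, target_far):
    """Score threshold whose upper tail holds roughly target_far of the imposter scores."""
    return float(np.quantile(imposter_scores, 1.0 - target_far))

def far(imposter_scores, threshold):
    """Fraction of imposter pairs wrongly accepted (score strictly above threshold)."""
    return float(np.mean(imposter_scores > threshold))

# Synthetic imposter-pair similarity scores for two hypothetical subgroups.
# Subgroup "B" is shifted upward to mimic a subgroup whose imposter pairs
# look more alike to the model (an assumption, not measured data).
rng = np.random.default_rng(0)
imposters = {
    "A": rng.normal(loc=0.0, scale=0.1, size=5000),
    "B": rng.normal(loc=0.1, scale=0.1, size=5000),
}

target_far = 0.01

# One global threshold set on the pooled scores.
pooled = np.concatenate(list(imposters.values()))
global_threshold = threshold_at_far(pooled, target_far)
far_global = {g: far(s, global_threshold) for g, s in imposters.items()}

# One threshold per subgroup, each set on that subgroup's own scores.
subgroup_thresholds = {g: threshold_at_far(s, target_far) for g, s in imposters.items()}
far_subgroup = {g: far(s, subgroup_thresholds[g]) for g, s in imposters.items()}

for g in imposters:
    print(f"subgroup {g}: FAR at global threshold = {far_global[g]:.4f}, "
          f"FAR at subgroup threshold = {far_subgroup[g]:.4f}")
```

In this toy setting the global threshold over-accepts imposters from the shifted subgroup and under-accepts them from the other, which is the kind of gap between reported and actual error rates described above. It does not reproduce the domain adaptation scheme itself.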