In this paper, we study performance and fairness on visual and thermal images and extend the assessment to masked synthetic images. Using the SpeakingFace and Thermal-Mask dataset, we propose a process to assess fairness on real images and show how the same process can be applied to synthetic images. The resulting process yields a demographic parity difference of 1.59 for random guessing, which increases to 5.0 when the recognition performance reaches a precision and recall of 99.99\%. We show that inherently biased datasets can deeply impact the fairness of any biometric system. A primary cause of a biased dataset is class imbalance introduced by the data collection process. To address this imbalance, classes with fewer samples can be augmented with synthetic images, producing a more balanced dataset and, in turn, less bias when training a machine learning system. Fairness is of critical importance for biometric-enabled systems, and the related concept of Equity, Diversity, and Inclusion (EDI) is well suited to generalizing fairness in biometrics; in this paper, we focus on the three most common demographic attributes: age, gender, and ethnicity.
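As a point of reference for the metric quoted above, the following is a minimal sketch of how a demographic parity difference can be computed: the gap between the highest and lowest positive-decision rates across demographic groups. The function name, the toy data, and the binary "gender" attribute are illustrative assumptions, not the paper's pipeline, and the assumption here is that the reported values (1.59, 5.0) are expressed as percentages of this gap.

\begin{verbatim}
import numpy as np

def demographic_parity_difference(y_pred, groups):
    """Gap between the highest and lowest positive-prediction
    (selection) rates across demographic groups.

    y_pred : array of binary decisions (1 = accepted/recognized)
    groups : array of group labels (e.g., age bin, gender, ethnicity)
    """
    y_pred = np.asarray(y_pred)
    groups = np.asarray(groups)
    rates = [y_pred[groups == g].mean() for g in np.unique(groups)]
    return max(rates) - min(rates)

# Toy example: a random-guess classifier over a hypothetical
# binary "gender" attribute.
rng = np.random.default_rng(0)
decisions = rng.integers(0, 2, size=1000)
gender = rng.choice(["A", "B"], size=1000)
print(100 * demographic_parity_difference(decisions, gender))  # in percent
\end{verbatim}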