This work summarizes the 2020 ChaLearn Looking at People Fair Face Recognition and Analysis Challenge, describes the top-ranked solutions, and analyzes the results. The aim of the challenge was to evaluate the accuracy of submitted algorithms, and their bias with respect to gender and skin colour, on the task of 1:1 face verification in the presence of other confounding attributes. Participants were evaluated using an in-the-wild dataset based on a reannotated version of IJB-C, further enriched with 12.5K new images and additional labels. The dataset is not balanced, which simulates a real-world scenario where AI-based models that are supposed to produce fair outcomes are trained and evaluated on imbalanced data. The challenge attracted 151 participants, who made more than 1.8K submissions in total. The final phase of the challenge attracted 36 active teams, of which 10 exceeded 0.999 AUC-ROC while achieving very low scores on the proposed bias metrics. Common strategies among participants were face pre-processing, homogenization of data distributions, the use of bias-aware loss functions, and ensemble models. The analysis of the top-10 teams shows higher false positive rates (and lower false negative rates) for females with dark skin tone, and indicates that eyeglasses and young age may also increase false positive rates.
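The per-group error rates mentioned in the analysis can be illustrated with a minimal sketch. This is not the challenge's bias metric, which is not defined in this abstract; it only shows how false positive and false negative rates of a 1:1 verifier could be compared across groups at a fixed threshold. The function name `group_error_rates`, the group tags, the threshold, and the toy scores are all hypothetical.

```python
from collections import defaultdict


def group_error_rates(scores, labels, groups, threshold):
    """Return {group: (FPR, FNR)} for 1:1 verification pairs.

    scores: similarity score per pair (higher means more likely same identity)
    labels: 1 for a genuine pair, 0 for an impostor pair
    groups: hypothetical demographic tag attached to each pair
    """
    counts = defaultdict(lambda: {"fp": 0, "fn": 0, "neg": 0, "pos": 0})
    for score, label, group in zip(scores, labels, groups):
        c = counts[group]
        accepted = score >= threshold
        if label == 1:
            c["pos"] += 1
            c["fn"] += int(not accepted)  # genuine pair rejected
        else:
            c["neg"] += 1
            c["fp"] += int(accepted)  # impostor pair accepted

    rates = {}
    for group, c in counts.items():
        fpr = c["fp"] / c["neg"] if c["neg"] else float("nan")
        fnr = c["fn"] / c["pos"] if c["pos"] else float("nan")
        rates[group] = (fpr, fnr)
    return rates


# Toy data (made up for illustration, not challenge results).
scores = [0.9, 0.4, 0.7, 0.2, 0.8, 0.6, 0.3, 0.1]
labels = [1, 1, 0, 0, 1, 1, 0, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

rates = group_error_rates(scores, labels, groups, threshold=0.5)
for group, (fpr, fnr) in sorted(rates.items()):
    print(f"group {group}: FPR={fpr:.2f} FNR={fnr:.2f}")
```

A gap between groups in either rate at the same threshold is the kind of disparity the analysis reports; a real evaluation would use far more pairs and would control for the confounding attributes the challenge annotates.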