MAAD-Face: A Massively Annotated Attribute Dataset for Face Images

Soft biometrics play an important role in face biometrics and related fields, since they can lead to biased performance, threaten user privacy, or be valuable for commercial purposes. Current face databases are constructed specifically for developing face recognition applications. Consequently, these databases contain large numbers of face images but fall short in the number of attribute annotations and in overall annotation correctness. In this work, we propose MAAD-Face, a new face annotation database characterized by its large number of high-quality attribute annotations. MAAD-Face is built on the VGGFace2 database and thus consists of 3.3M faces of over 9k individuals. Using a novel annotation-transfer pipeline that allows accurate label transfer from multiple source datasets to a target dataset, MAAD-Face provides 123.9M attribute annotations of 47 different binary attributes. Consequently, it provides 15 and 137 times more attribute labels than CelebA and LFW, respectively. Our investigation of the annotation quality by three human evaluators demonstrated the superiority of the MAAD-Face annotations over existing databases. Additionally, we make use of the large number of high-quality annotations from MAAD-Face to study the viability of soft biometrics for recognition, providing insights into which attributes support genuine and imposter decisions. The MAAD-Face annotations dataset is publicly available.
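The figures quoted in the abstract can be cross-checked with a little arithmetic. The sketch below is a back-of-the-envelope calculation in Python using only the rounded numbers stated above (3.3M faces, 47 attributes, 123.9M annotations, ratios of 15 and 137); the variable names are illustrative, and the derived values are approximations, not numbers reported by the authors. In particular, the gap between the dense upper bound and the stated total is only an inference that not every face carries a label for every attribute.

```python
# Figures quoted in the abstract (in millions where noted). All are rounded.
total_annotations_m = 123.9   # attribute annotations in MAAD-Face
num_faces_m = 3.3             # face images inherited from VGGFace2
num_attributes = 47           # binary attributes

# Upper bound if every face carried a label for every attribute.
max_annotations_m = num_faces_m * num_attributes

# Average number of annotated attributes per face implied by the totals.
avg_labels_per_face = total_annotations_m / num_faces_m

# Share of the dense upper bound that the stated total covers.
coverage = total_annotations_m / max_annotations_m

# Label counts for CelebA and LFW implied by the stated 15x and 137x ratios.
# These are derived from the abstract's ratios, not official dataset sizes.
implied_celeba_m = total_annotations_m / 15
implied_lfw_m = total_annotations_m / 137

print(f"dense upper bound:      {max_annotations_m:.1f}M labels")
print(f"avg labels per face:    {avg_labels_per_face:.1f} of {num_attributes}")
print(f"coverage of the bound:  {coverage:.1%}")
print(f"implied CelebA labels:  {implied_celeba_m:.2f}M")
print(f"implied LFW labels:     {implied_lfw_m:.2f}M")
```

Because the inputs are rounded, the results are only indicative: roughly 37 to 38 annotated attributes per face on average, and on the order of 8M and 0.9M labels for CelebA and LFW.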

