Recent deep face recognition models in the literature rely on large-scale public datasets such as MS-Celeb-1M and VGGFace2 to train very deep neural networks, achieving state-of-the-art performance on mainstream benchmarks. Many of these datasets, including MS-Celeb-1M and VGGFace2, have recently been retracted due to credible privacy and ethical concerns. This motivates us to propose, and investigate the feasibility of, using a privacy-friendly, synthetically generated face dataset to train face recognition models. To this end, we use a class-conditional generative adversarial network to generate a dataset of class-labeled synthetic face images, which we call SFace. To address the privacy aspect of using such data to train a face recognition model, we provide extensive evaluation experiments on the identity relation between the synthetic dataset and the original authentic dataset used to train the generative model. Our evaluation shows that associating an identity in the authentic dataset with the identity carrying the same class label in the synthetic dataset is hardly possible. We also propose to train face recognition models on our privacy-friendly dataset, SFace, using three different learning strategies: multi-class classification, label-free knowledge transfer, and combined learning of multi-class classification and knowledge transfer. The evaluation results on five authentic face benchmarks demonstrate that the privacy-friendly synthetic dataset has high potential for training face recognition models, achieving, for example, a verification accuracy of 91.87% on LFW using multi-class classification and 99.13% using the combined learning strategy.
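As a rough illustration of the combined learning strategy, the sketch below adds a classification term on the synthetic class labels to a knowledge-transfer term that pulls a student embedding toward a teacher embedding. The abstract does not specify the loss functions, so everything here is an assumption: plain softmax cross-entropy stands in for the classification loss, mean squared error between L2-normalized embeddings stands in for the label-free knowledge transfer, and the names `combined_loss` and `kt_weight` are hypothetical.

```python
import math


def softmax_cross_entropy(logits, label):
    """Cross-entropy for one sample, given raw class logits and the true class index."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def l2_normalize(vec):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def knowledge_transfer_loss(student_emb, teacher_emb):
    """Assumed form: mean squared error between L2-normalized embeddings.

    No class label is used, which is what makes this term label-free.
    """
    s = l2_normalize(student_emb)
    t = l2_normalize(teacher_emb)
    return sum((a - b) ** 2 for a, b in zip(s, t)) / len(s)


def combined_loss(logits, label, student_emb, teacher_emb, kt_weight=1.0):
    """Classification term plus weighted knowledge-transfer term (hypothetical weighting)."""
    ce = softmax_cross_entropy(logits, label)
    kt = knowledge_transfer_loss(student_emb, teacher_emb)
    return ce + kt_weight * kt


# Toy usage: three synthetic classes, two-dimensional embeddings.
loss = combined_loss([2.0, 0.5, -1.0], 0, [0.6, 0.8], [0.8, 0.6])
```

Setting `kt_weight` to zero recovers the pure multi-class classification strategy, and dropping the cross-entropy term leaves the label-free knowledge-transfer strategy, so the three strategies differ only in which terms of this sum are active.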