In recent years, significant progress has been made in face recognition due to the availability of large-scale labeled face datasets. However, since the faces in these datasets usually exhibit a limited degree and range of variation, models trained on them generalize poorly to more realistic, unconstrained face datasets. While collecting labeled faces with larger variations could help, it is practically infeasible due to privacy concerns and labor costs. In comparison, it is easier to acquire a large number of unlabeled faces from different domains, which better represent the testing scenarios in real-world problems. We present an approach that uses such unlabeled faces to learn generalizable face representations, which can be viewed as an unsupervised domain generalization framework. Experimental results on unconstrained datasets show that a small amount of unlabeled data with sufficient diversity can (i) lead to an appreciable gain in recognition performance and (ii) outperform the supervised baseline when combined with less than half of the labeled data. Compared with state-of-the-art face recognition methods, our method further improves their performance on challenging benchmarks such as IJB-B, IJB-C, and IJB-S.