Face verification aims to distinguish genuine pairs of faces, which show the same identity, from imposter pairs, which show different identities. The performance reported in recent years gives the impression that the task is practically solved. Here, we revisit the problem and argue that existing evaluation datasets were built using two oversimplifying design choices. First, the identities usually selected to form imposter pairs are not challenging enough, whereas in practice verification is needed precisely to detect challenging imposters. Second, the demographics underlying existing datasets are often insufficient to account for the wide diversity of facial characteristics of people across the world. To mitigate these limitations, we introduce the $FaVCI2D$ dataset. Its imposter pairs are challenging because they include visually similar faces selected from a large pool of demographically diversified identities. The dataset also includes metadata related to gender, country and age to facilitate fine-grained analysis of results. $FaVCI2D$ is generated from freely distributable resources. Experiments with state-of-the-art deep models that provide nearly 100\% performance on existing datasets show a significant performance drop on $FaVCI2D$, confirming our starting hypothesis. Equally important, we analyze legal and ethical challenges that have appeared in recent years and hindered the development of face analysis research. We introduce a series of design choices that address these challenges and make the creation and usage of the dataset more sustainable and fairer. $FaVCI2D$ is available at~\url{https://github.com/AIMultimediaLab/FaVCI2D-Face-Verification-with-Challenging-Imposters-and-Diversified-Demographics}.