Face detection searches an image for all regions that may contain faces and, if any are present, locates them. Many applications, including face recognition, facial expression recognition, face tracking and head-pose estimation, assume that both the location and the size of the faces in the image are known. In recent decades, researchers have developed many representative and efficient face detectors, from the Viola-Jones face detector to current CNN-based ones. However, with the tremendous increase in images and videos exhibiting variations in face scale, appearance, expression, occlusion and pose, traditional face detectors struggle to detect the variety of faces found "in the wild". The emergence of deep learning techniques has brought remarkable breakthroughs to face detection, at the price of a considerable increase in computation. This paper introduces representative deep learning-based methods and presents a thorough analysis of them in terms of accuracy and efficiency. We further compare and discuss the popular and challenging datasets and their evaluation metrics. A comprehensive comparison of several successful deep learning-based face detectors is conducted to assess their efficiency using two metrics: FLOPs and latency. The paper can guide the choice of appropriate face detectors for different applications and the development of more efficient and accurate detectors.