With the increasing deployment of face recognition systems in daily life, face presentation attack detection (PAD) is attracting much attention and plays a key role in securing these systems. Although hand-crafted and deep-learning-based methods achieve strong performance in intra-dataset evaluations, their performance drops in unseen scenarios. In this work, we propose a dual-stream convolutional neural network (CNN) framework. One stream adopts four learnable frequency filters to learn features in the frequency domain, which are less influenced by variations in sensors and illumination. The other stream leverages the RGB images to complement the frequency-domain features. Moreover, we propose a hierarchical attention module integration that joins the information from the two streams at different stages, taking into account the nature of deep features in different layers of the CNN. The proposed method is evaluated in intra-dataset and cross-dataset setups, and the results demonstrate that it enhances generalizability in most experimental setups in comparison to state-of-the-art methods, including those designed explicitly for domain adaptation/shift problems. We validate the design of our PAD solution in a step-wise ablation study that covers the proposed learnable frequency decomposition, the hierarchical attention module design, and the loss function used. Training code and pre-trained models are publicly released.
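To make the frequency stream more concrete, the following is a minimal sketch of how an image could be split into four frequency components with filters that have a fixed part and a learnable part. It is an illustration under stated assumptions, not the released implementation: the abstract does not name the transform, so the use of a 2D DCT, the band edges in `band_edges`, the tanh squashing of the learnable weights, and all function names are hypothetical. No training is performed here; the `learnable` weights are simply passed in.

```python
import math

def dct_matrix(n):
    """Orthonormal DCT-II matrix of size n x n."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows

def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def band_edges(n):
    """Assumed band edges on the index sum u + v: low, mid, high, all."""
    top = 2 * n - 1  # u + v takes the values 0 .. 2n - 2
    a, b = top // 3, 2 * top // 3
    return [(0, a), (a, b), (b, top), (0, top)]

def decompose(image, learnable):
    """Split a square image into four frequency components.

    Each filter is a fixed 0/1 band mask plus a learnable offset squashed
    into (-1, 1) with tanh (an assumption of this sketch).
    """
    n = len(image)
    d = dct_matrix(n)
    dt = transpose(d)
    spectrum = matmul(matmul(d, image), dt)  # forward 2D DCT
    components = []
    for (lo, hi), w in zip(band_edges(n), learnable):
        filtered = [[spectrum[u][v]
                     * ((1.0 if lo <= u + v < hi else 0.0) + math.tanh(w[u][v] / 2.0))
                     for v in range(n)] for u in range(n)]
        components.append(matmul(matmul(dt, filtered), d))  # inverse 2D DCT
    return components

# Toy 4 x 4 single-channel image and untrained (all-zero) learnable weights.
image = [[float((3 * r + 5 * c) % 7) for c in range(4)] for r in range(4)]
zeros = [[[0.0] * 4 for _ in range(4)] for _ in range(4)]
low, mid, high, full = decompose(image, zeros)
```

With all-zero learnable weights the filters reduce to the fixed band masks, so the low, mid, and high components sum back to the input and the all-band component reproduces it. In a trainable version the weights would be updated by backpropagation, and the resulting components would be fed to the frequency stream of the network.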