Vulnerability to presentation attacks is a crucial problem undermining the wide deployment of face recognition systems. Though presentation attack detection (PAD) systems try to address this problem, their lack of generalization and robustness continues to be a major concern. Several works have shown that multi-channel PAD systems can alleviate this vulnerability and yield more robust systems. However, a wide selection of channels is available for a PAD system, such as RGB, near-infrared, shortwave infrared, depth, and thermal sensors. Adding sensors increases the cost of the system, so selecting modalities requires an understanding of how different sensors perform against a wide variety of attacks. In this work, we perform a comprehensive study of the effectiveness of various imaging modalities for PAD. The studies are performed on a multi-channel PAD dataset collected with 14 different sensing modalities and covering a wide range of 2D, 3D, and partial attacks. We use a multi-channel convolutional network-based architecture with pixel-wise binary supervision. The model is evaluated with different combinations of channels and different image qualities on a variety of challenging known and unknown attack protocols. The results reveal interesting trends and can act as pointers for sensor selection for safety-critical presentation attack detection systems. The source code and protocols to reproduce the results are publicly available, making it possible to extend this work to other architectures.
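As a rough illustration of the two ideas named above, pixel-wise binary supervision and evaluation over combinations of channels, the following minimal sketch may help. It is not the released code: the function names, the channel labels, and the assumption that the target map is all ones for bona fide samples and all zeros for attacks are hypothetical choices made for this example.

```python
import math
from itertools import combinations

# Hypothetical channel labels, taken from the modalities listed in the abstract.
CHANNELS = ["rgb", "nir", "swir", "depth", "thermal"]


def pixelwise_bce(pred_map, is_bonafide, eps=1e-7):
    """Mean binary cross-entropy over every pixel of a predicted score map.

    Assumption: the target map is constant, 1.0 for bona fide samples and
    0.0 for attacks, so each output pixel is supervised with the same label.
    """
    target = 1.0 if is_bonafide else 0.0
    total, count = 0.0, 0
    for row in pred_map:
        for p in row:
            p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
            total += -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))
            count += 1
    return total / count


def channel_subsets(channels, max_size):
    """Yield every combination of 1..max_size channels to evaluate."""
    for k in range(1, max_size + 1):
        yield from combinations(channels, k)


if __name__ == "__main__":
    # A toy 2x2 score map that leans towards "bona fide".
    score_map = [[0.9, 0.8], [0.85, 0.95]]
    print(pixelwise_bce(score_map, is_bonafide=True))
    print(pixelwise_bce(score_map, is_bonafide=False))
    for subset in channel_subsets(CHANNELS, max_size=2):
        print(subset)
```

In a real system, the selected channels would be stacked as the network input and the loss would be computed on the network's output map. Here the score map is supplied by hand only to show how the per-pixel loss behaves.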