Face anti-spoofing (FAS) and face forgery detection play vital roles in securing face biometric systems against presentation attacks (PAs) and malicious digital manipulation (e.g., deepfakes). Despite promising performance with large-scale data and powerful deep models, the generalization of existing approaches remains an open problem. Most recent approaches focus on 1) unimodal visual appearance or physiological cues (i.e., remote photoplethysmography, rPPG); and 2) separate feature representations for FAS or face forgery detection. On the one hand, unimodal appearance and rPPG features are vulnerable to high-fidelity 3D face mask attacks and video replay attacks, respectively, which inspires us to design reliable multi-modal fusion mechanisms for generalized face attack detection. On the other hand, FAS and face forgery detection share rich common features (e.g., periodic rPPG rhythms and unaltered appearance for bona fide faces), which provides solid grounds for designing a joint FAS and face forgery detection system in a multi-task learning fashion. In this paper, we establish the first joint face spoofing and forgery detection benchmark using both visual appearance and physiological rPPG cues. To enhance rPPG periodicity discrimination, we design a two-branch physiological network that takes both the facial spatio-temporal rPPG signal map and its continuous wavelet transformed counterpart as inputs. To mitigate modality bias and improve fusion efficacy, we apply weighted batch and layer normalization to both appearance and rPPG features before multi-modal fusion. We find that the generalization capacity of both unimodal (appearance or rPPG) and multi-modal (appearance+rPPG) models can be clearly improved via joint training on these two tasks. We hope this new benchmark will facilitate future research in both the FAS and deepfake detection communities.
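As a rough illustration of the weighted batch and layer normalization applied before fusion, the following minimal sketch blends the two normalizations per modality and then concatenates the results. The abstract does not specify the weighting scheme, so the convex combination, the weight `w`, the function names, and the feature shapes below are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

def batch_norm(x, eps=1e-5):
    # Normalize each feature (column) across the batch dimension.
    mean = x.mean(axis=0, keepdims=True)
    var = x.var(axis=0, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def layer_norm(x, eps=1e-5):
    # Normalize each sample (row) across its feature dimension.
    mean = x.mean(axis=1, keepdims=True)
    var = x.var(axis=1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def weighted_norm(x, w=0.5):
    # Assumed blend: w * BN(x) + (1 - w) * LN(x), with w in [0, 1].
    return w * batch_norm(x) + (1.0 - w) * layer_norm(x)

def fuse(appearance, rppg, w=0.5):
    # Normalize each modality separately, then concatenate along features,
    # so that neither modality dominates purely because of its scale.
    return np.concatenate(
        [weighted_norm(appearance, w), weighted_norm(rppg, w)], axis=1
    )

# Hypothetical features: 8 samples, with very different scales per modality.
rng = np.random.default_rng(0)
appearance = rng.normal(loc=5.0, scale=10.0, size=(8, 16))
rppg = rng.normal(loc=0.0, scale=0.1, size=(8, 4))

fused = fuse(appearance, rppg)
print(fused.shape)
```

In a trained model the weight `w` would more plausibly be a learnable parameter than a fixed constant; it is fixed here only to keep the sketch short.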