Advances in face forgery techniques have attracted considerable attention in computer vision because of the security concerns they raise. We observe that up-sampling is a necessary step in most face forgery techniques, and that cumulative up-sampling produces obvious changes in the frequency domain, especially in the phase spectrum. In natural images, the phase spectrum preserves abundant frequency components that provide extra information and compensate for what is lost in the amplitude spectrum. To this end, we present a novel Spatial-Phase Shallow Learning (SPSL) method for face forgery detection, which combines the spatial image with the phase spectrum to capture the up-sampling artifacts of face forgery and thereby improve transferability. We also theoretically analyze the validity of using the phase spectrum. Moreover, we notice that local texture information is more crucial than high-level semantic information for face forgery detection. We therefore reduce the receptive field by making the network shallower, which suppresses high-level features and focuses the model on local regions. Extensive experiments show that SPSL achieves state-of-the-art performance in cross-dataset evaluation and multi-class classification, and obtains comparable results in single-dataset evaluation.
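To make the idea of combining the spatial image with the phase spectrum concrete, the following is a minimal sketch and not the authors' implementation. It assumes that the phase information is brought back to the spatial domain by an inverse FFT with the amplitude forced to 1, and that the result is appended to the RGB image as a fourth channel. The function names `phase_only_image` and `spatial_phase_input`, and the grayscale conversion by channel averaging, are hypothetical choices made for illustration.

```python
import numpy as np


def phase_only_image(gray: np.ndarray) -> np.ndarray:
    """Rebuild a 2-D image from its phase spectrum alone (amplitude set to 1)."""
    spectrum = np.fft.fft2(gray.astype(np.float64))
    phase = np.angle(spectrum)           # phase spectrum, values in [-pi, pi]
    unit_spectrum = np.exp(1j * phase)   # keep the phase, discard the amplitude
    # For a real input the unit spectrum stays conjugate-symmetric, so the
    # inverse transform is real up to floating-point error.
    return np.real(np.fft.ifft2(unit_spectrum))


def spatial_phase_input(rgb: np.ndarray) -> np.ndarray:
    """Stack an (H, W, 3) image with a phase-only channel into (H, W, 4).

    Assumption: the grayscale image is the plain mean of the three channels.
    """
    rgb = rgb.astype(np.float64)
    gray = rgb.mean(axis=2)
    phase_channel = phase_only_image(gray)
    return np.concatenate([rgb, phase_channel[..., None]], axis=2)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    face = rng.random((64, 64, 3))       # random stand-in for a face crop
    print(spatial_phase_input(face).shape)
```

The four-channel array would then be fed to a classifier with a small receptive field; the network itself is omitted here because the abstract does not specify its layers.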