DeepFake technology has advanced significantly in recent years, enabling the creation of highly realistic synthetic face images. Existing DeepFake detection methods often struggle under real-world conditions, where pose variations, occlusions, and subtle artifacts make manipulations difficult to detect. To address these challenges, we propose a multi-view architecture that enhances DeepFake detection by analyzing facial features at multiple levels. Our approach integrates three specialized encoders: a global view encoder for detecting boundary inconsistencies, a middle view encoder for analyzing texture and color alignment, and a local view encoder for capturing distortions in expressive facial regions such as the eyes, nose, and mouth, where DeepFake artifacts frequently occur. Additionally, we incorporate a face orientation encoder, trained to classify face poses, to ensure robust detection across viewing angles. By fusing features from these encoders, our model achieves superior performance in detecting manipulated images, even under challenging pose and lighting conditions. Experimental results on challenging datasets demonstrate the effectiveness of our method, which outperforms conventional single-view approaches.
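The fusion of global, middle, local, and orientation features described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not specify how the views are cropped, what the encoders compute, or how features are fused. Here the three views are taken as progressively tighter center crops, a hypothetical `toy_encoder` stands in for each learned encoder, fusion is plain concatenation, and a logistic score stands in for the classifier head.

```python
import math
import random
from typing import List, Sequence

Image = List[List[float]]  # grayscale image: rows of pixel intensities in [0, 1]


def center_crop(img: Image, frac: float) -> Image:
    """Keep the central `frac` of the image along each dimension."""
    h, w = len(img), len(img[0])
    ch, cw = max(1, int(h * frac)), max(1, int(w * frac))
    top, left = (h - ch) // 2, (w - cw) // 2
    return [row[left:left + cw] for row in img[top:top + ch]]


def toy_encoder(img: Image) -> List[float]:
    """Hypothetical stand-in for a learned encoder.

    Returns [mean intensity, standard deviation, mean horizontal gradient].
    """
    pixels = [p for row in img for p in row]
    mean = sum(pixels) / len(pixels)
    var = sum((p - mean) ** 2 for p in pixels) / len(pixels)
    grads = [abs(row[i + 1] - row[i]) for row in img for i in range(len(row) - 1)]
    grad = sum(grads) / len(grads) if grads else 0.0
    return [mean, math.sqrt(var), grad]


def fuse_views(img: Image, pose_features: Sequence[float]) -> List[float]:
    """Concatenate features from three views plus pose features.

    The crop fractions (1.0, 0.6, 0.3) are illustrative assumptions.
    """
    global_feat = toy_encoder(img)                    # whole face, including boundary
    middle_feat = toy_encoder(center_crop(img, 0.6))  # inner face region
    local_feat = toy_encoder(center_crop(img, 0.3))   # tight central region
    return global_feat + middle_feat + local_feat + list(pose_features)


def fake_probability(features: Sequence[float],
                     weights: Sequence[float],
                     bias: float) -> float:
    """Logistic score over the fused features (stand-in for a classifier head)."""
    z = sum(f * w for f, w in zip(features, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))


if __name__ == "__main__":
    random.seed(0)
    face = [[random.random() for _ in range(8)] for _ in range(8)]
    pose = [0.0, 1.0, 0.0]  # assumed one-hot pose class, e.g. left/frontal/right
    fused = fuse_views(face, pose)
    weights = [0.1] * len(fused)  # untrained placeholder weights
    print(len(fused), fake_probability(fused, weights, 0.0))
```

In a real system each `toy_encoder` call would be replaced by a separately trained network, and the pose vector would come from the orientation encoder's output rather than being supplied by hand. Concatenation is only one possible fusion choice; it keeps each view's features separate so the classifier can weight them independently.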