The rapid advancement of generative artificial intelligence has enabled the creation of highly realistic fake facial images, posing serious threats to personal privacy and the integrity of online information. Existing deepfake detection methods often rely on handcrafted forensic cues and complex architectures; they perform strongly in intra-domain settings but degrade significantly when confronted with unseen forgery patterns. In this paper, we propose GenDF, a simple yet effective framework that transfers a powerful large-scale vision model to the deepfake detection task using a compact network design. GenDF incorporates three components: deepfake-specific representation learning to capture discriminative patterns between real and fake facial images, feature space redistribution to mitigate distribution mismatch, and a classification-invariant feature augmentation strategy to enhance generalization without introducing additional trainable parameters. Extensive experiments demonstrate that GenDF achieves state-of-the-art generalization performance in cross-domain and cross-manipulation settings while requiring only 0.28M trainable parameters, validating the effectiveness and efficiency of the proposed framework.