Realistic facial manipulation technologies have made remarkable progress, and social concerns about their potential malicious abuse have given rise to an emerging research topic: face forgery detection. The task is extremely challenging, however, because recent methods can forge faces beyond the perception ability of human eyes, especially in compressed images and videos. We find that mining forgery patterns with an awareness of frequency could be a cure, since frequency provides a complementary viewpoint in which either subtle forgery artifacts or compression errors can be well described. To introduce frequency into face forgery detection, we propose a novel Frequency in Face Forgery Network (F3-Net), which takes advantage of two different but complementary frequency-aware clues, 1) frequency-aware decomposed image components and 2) local frequency statistics, to deeply mine forgery patterns via a two-stream collaborative learning framework. We use the discrete cosine transform (DCT) as the frequency-domain transformation. Through comprehensive studies, we show that the proposed F3-Net significantly outperforms competing state-of-the-art methods on all compression qualities of the challenging FaceForensics++ dataset, with an especially large lead on low-quality media.
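To make the two frequency-aware clues concrete, the following is a minimal NumPy sketch, not the F3-Net implementation. It assumes a single-channel image, fixed (non-learnable) band masks, and non-overlapping windows; the band edges (1/16 and 1/8), the window size of 8, and all function names (`dct_matrix`, `dct2`, `idct2`, `band_masks`, `decompose`, `local_frequency_statistics`) are illustrative assumptions. The two network streams and their collaborative learning are omitted.

```python
import numpy as np


def dct_matrix(n):
    """Orthonormal DCT-II basis as an (n, n) matrix."""
    k = np.arange(n)[:, None]  # frequency index
    i = np.arange(n)[None, :]  # sample index
    d = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    d[0, :] /= np.sqrt(2.0)  # rescale the DC row so that d @ d.T == I
    return d


def dct2(x):
    """2D DCT of a 2D array (rows, then columns)."""
    return dct_matrix(x.shape[0]) @ x @ dct_matrix(x.shape[1]).T


def idct2(c):
    """Inverse of dct2; valid because the basis is orthonormal."""
    return dct_matrix(c.shape[0]).T @ c @ dct_matrix(c.shape[1])


def band_masks(h, w, edges=(1 / 16, 1 / 8)):
    """Boolean masks that partition the DCT plane into low/mid/high bands.

    The band edges are an assumption made for illustration.
    """
    u = np.arange(h)[:, None] / max(h - 1, 1)
    v = np.arange(w)[None, :] / max(w - 1, 1)
    r = (u + v) / 2.0  # normalized frequency position in [0, 1]
    bounds = (0.0,) + tuple(edges) + (np.inf,)
    return [(r >= lo) & (r < hi) for lo, hi in zip(bounds[:-1], bounds[1:])]


def decompose(image):
    """Clue 1: split an image into spatial components, one per band."""
    coeffs = dct2(image)
    return [idct2(coeffs * m) for m in band_masks(*image.shape)]


def local_frequency_statistics(image, window=8, eps=1e-8):
    """Clue 2: mean log-power per band inside each non-overlapping window."""
    rows, cols = image.shape[0] // window, image.shape[1] // window
    masks = band_masks(window, window)
    stats = np.zeros((rows, cols, len(masks)))
    for r in range(rows):
        for c in range(cols):
            patch = image[r * window:(r + 1) * window,
                          c * window:(c + 1) * window]
            log_power = np.log10(dct2(patch) ** 2 + eps)
            stats[r, c] = [log_power[m].mean() for m in masks]
    return stats
```

Because the masks partition the DCT plane and the transform is linear, the decomposed components sum back to the input image, while the statistics map has one vector of band energies per window. In a detector along the lines described above, each output would feed a separate stream.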