Face manipulation technology is advancing rapidly, and new methods are proposed almost daily. The aim of this work is to propose a deepfake detector that can cope with the wide variety of manipulation methods and scenarios encountered in the real world. Our key insight is that each person has specific biometric characteristics that a synthetic generator is unlikely to reproduce. Accordingly, we extract high-level audio-visual biometric features that characterize the identity of a person, and use them to create a person-of-interest (POI) deepfake detector. We leverage a contrastive learning paradigm to learn the moving-face and audio segment embeddings that are most discriminative for each identity. As a result, when the video and/or audio of a person is manipulated, its representation in the embedding space becomes inconsistent with that of the real identity, allowing reliable detection. Training is carried out exclusively on real talking-face videos, so the detector does not depend on any specific manipulation method and generalizes well. In addition, our method can detect both single-modality (audio-only, video-only) and multi-modality (audio-video) attacks, and because it builds only on high-level semantic features, it is robust to low-quality or corrupted videos. Experiments on a wide variety of datasets confirm that our method achieves state-of-the-art performance, with an average AUC improvement of around 3%, 10%, and 4% for high-quality, low-quality, and attacked videos, respectively. Code is available at https://github.com/grip-unina/poi-forensics
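
The detection idea described above (compare the embedding of a test video with embeddings of real videos of the same person, and flag an inconsistency) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the use of cosine similarity, the toy 3-dimensional embeddings, and the threshold of 0.5 are all assumptions made for illustration, and the embeddings are assumed to come from an already trained encoder.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    a, b = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a, b))


def poi_consistency(test_segments, reference_segments):
    """Average, over test segments, of the best similarity to any
    reference segment of the person of interest (hypothetical score)."""
    if not test_segments or not reference_segments:
        raise ValueError("both segment lists must be non-empty")
    best = [
        max(cosine_similarity(t, r) for r in reference_segments)
        for t in test_segments
    ]
    return sum(best) / len(best)


def is_manipulated(test_segments, reference_segments, threshold=0.5):
    """Flag the test video when its embeddings are inconsistent with
    the reference identity. The threshold is an assumed value."""
    return poi_consistency(test_segments, reference_segments) < threshold


# Toy embeddings standing in for the output of a trained encoder.
reference = [[1.0, 0.1, 0.0], [0.9, 0.2, 0.1]]   # real videos of the POI
pristine = [[0.95, 0.15, 0.05]]                  # close to the references
suspect = [[0.0, 1.0, 0.2]]                      # far from the references

print(is_manipulated(pristine, reference))
print(is_manipulated(suspect, reference))
```

Because the decision relies only on real reference videos of the person, nothing in this sketch depends on how the fake was generated, which mirrors the motivation given in the abstract. In practice the score and threshold would have to be chosen on validation data.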