The detection of digital face manipulation in video has attracted extensive attention due to the growing threat it poses to public trust. To counteract the malicious use of such techniques, deep learning-based deepfake detection methods have been developed and have shown impressive results. However, the performance of these detectors is often evaluated on benchmarks that hardly reflect real-world conditions. For example, the impact of common video processing operations on detection accuracy has not been systematically assessed. To address this gap, this paper first analyzes numerous real-world influencing factors and typical video processing operations. A more systematic assessment methodology is then proposed, which enables a quantitative evaluation of a detector's robustness under different processing operations. Moreover, extensive experiments are carried out on three popular deepfake detectors, yielding a detailed analysis of the impact of each operation and insights to guide future research.
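To make the assessment idea concrete, the following is a minimal sketch of how a robustness evaluation under processing operations could be structured: frames are perturbed by typical operations (noise, brightness change, and a crude stand-in for compression) at several severity levels, and detector accuracy is measured against the clean baseline. The specific operations, severity levels, and the toy detector here are illustrative assumptions, not the paper's actual protocol.

```python
import numpy as np

# Illustrative perturbations standing in for real-world processing operations.
def add_gaussian_noise(frame, sigma):
    noisy = frame.astype(np.float32) + np.random.normal(0, sigma, frame.shape)
    return np.clip(noisy, 0, 255).astype(np.uint8)

def change_brightness(frame, delta):
    return np.clip(frame.astype(np.int16) + delta, 0, 255).astype(np.uint8)

def quantize(frame, step):
    # Crude stand-in for compression artifacts (assumed for illustration).
    return ((frame // step) * step).astype(np.uint8)

# Each operation is evaluated at increasing severity levels.
OPERATIONS = {
    "noise":      [lambda f, s=s: add_gaussian_noise(f, s) for s in (5, 15, 30)],
    "brightness": [lambda f, d=d: change_brightness(f, d) for d in (10, 30, 60)],
    "quantize":   [lambda f, q=q: quantize(f, q) for q in (8, 16, 32)],
}

def evaluate(detector, frames, labels, op):
    """Accuracy of `detector` on frames transformed by `op`."""
    preds = [detector(op(f)) for f in frames]
    return float(np.mean([p == y for p, y in zip(preds, labels)]))

def robustness_report(detector, frames, labels):
    """Accuracy per operation and severity level, plus the clean baseline."""
    report = {"clean": evaluate(detector, frames, labels, lambda f: f)}
    for name, levels in OPERATIONS.items():
        report[name] = [evaluate(detector, frames, labels, op) for op in levels]
    return report
```

Comparing each per-operation accuracy against the `clean` entry quantifies how much a given operation degrades the detector, which is the kind of per-operation analysis the paper advocates.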