A variety of effective face-swap and face-reenactment methods have been published in recent years, democratizing face synthesis technology to a great extent. Videos generated with these methods have come to be called deepfakes, a term that carries a negative connotation because of the social problems such videos have caused. In response to the emerging threat of deepfakes, we have built the Korean DeepFake Detection Dataset (KoDF), a large-scale collection of synthesized and real videos focused on Korean subjects. In this paper, we describe in detail the methods used to construct the dataset, show experimentally that the distribution of KoDF differs from those of existing deepfake detection datasets, and underline the importance of using multiple datasets for real-world generalization. KoDF is publicly available at https://moneybrain-research.github.io/kodf in its entirety (i.e., real clips, synthesized clips, clips with adversarial attacks, and metadata).