The rapid progress of photorealistic synthesis techniques has reached a critical point where the boundary between real and manipulated images starts to blur. Benchmarking and advancing digital forgery analysis has therefore become a pressing issue. However, existing face forgery datasets either have limited diversity or support only coarse-grained analysis. To counter this emerging threat, we construct the ForgeryNet dataset, an extremely large face forgery dataset with unified annotations for image- and video-level data across four tasks: 1) Image Forgery Classification, including two-way (real / fake), three-way (real / fake with identity-replaced forgery approaches / fake with identity-remained forgery approaches), and n-way (real and 15 respective forgery approaches) classification; 2) Spatial Forgery Localization, which segments the manipulated area of fake images by comparison with their corresponding real source images; 3) Video Forgery Classification, which redefines video-level forgery classification so that manipulated frames may appear at random positions, an important setting because real-world attackers are free to manipulate any target frame; and 4) Temporal Forgery Localization, which localizes the temporal segments that have been manipulated. ForgeryNet is by far the largest publicly available deep face forgery dataset in terms of data scale (2.9 million images, 221,247 videos), manipulations (7 image-level approaches, 8 video-level approaches), perturbations (36 independent and more mixed perturbations), and annotations (6.3 million classification labels, 2.9 million manipulated area annotations, and 221,247 temporal forgery segment labels). We perform extensive benchmarking and studies of existing face forensics methods and obtain several valuable observations.
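To make the relationship between the two video-level tasks concrete, the following is a minimal sketch, assuming a video is described by a hypothetical per-frame list of booleans (`True` for a manipulated frame). This representation and the function names `frames_to_segments` and `video_is_fake` are illustrative assumptions, not ForgeryNet's actual annotation format or API.

```python
from typing import List, Tuple


def frames_to_segments(frame_is_fake: List[bool]) -> List[Tuple[int, int]]:
    """Group consecutive manipulated frames into inclusive (start, end) index pairs.

    The per-frame boolean input is an assumed representation for illustration.
    """
    segments: List[Tuple[int, int]] = []
    start = None  # index where the current manipulated run began, if any
    for i, fake in enumerate(frame_is_fake):
        if fake and start is None:
            start = i  # a manipulated run begins
        elif not fake and start is not None:
            segments.append((start, i - 1))  # the run ended on the previous frame
            start = None
    if start is not None:
        # the video ends while still inside a manipulated run
        segments.append((start, len(frame_is_fake) - 1))
    return segments


def video_is_fake(frame_is_fake: List[bool]) -> bool:
    """Video-level label: fake if any frame, at any position, is manipulated."""
    return any(frame_is_fake)


if __name__ == "__main__":
    flags = [False, True, True, False, True]
    print(frames_to_segments(flags))
    print(video_is_fake(flags))
```

Under this assumed representation, Video Forgery Classification reduces to asking whether any frame is manipulated, while Temporal Forgery Localization asks for the frame ranges themselves.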