Advances in photo editing and manipulation tools have made it significantly easier to create fake imagery. Learning to detect such manipulations, however, remains a challenging problem due to the lack of sufficient training data. In this paper, we propose a model that learns to detect visual manipulations from unlabeled data through self-supervision. Given a large collection of real photographs with automatically recorded EXIF metadata, we train a model to determine whether an image is self-consistent --- that is, whether its content could have been produced by a single imaging pipeline. We apply this self-supervised learning method to the task of detecting and localizing image splices. Although the proposed model obtains state-of-the-art performance on several benchmarks, we see it as merely a step in the long quest for a truly general-purpose visual forensics tool.
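To make the self-supervision signal concrete, the following is a minimal sketch of how training labels could be derived from EXIF metadata alone, with no manual annotation. It is an illustration under stated assumptions, not the model proposed here: the photo records, the attribute list `ATTRIBUTES`, and the helpers `pair_labels` and `training_pairs` are all hypothetical. The assumption is that two photographs (or patches cropped from them) are compared attribute by attribute, and a label of 1 is recorded when both report the same value.

```python
from itertools import combinations

# Hypothetical EXIF records for three real photographs (illustrative values only).
photos = {
    "a.jpg": {"Make": "Canon", "FocalLength": 50, "ISO": 100},
    "b.jpg": {"Make": "Canon", "FocalLength": 35, "ISO": 100},
    "c.jpg": {"Make": "Nikon", "FocalLength": 35},
}

# Hypothetical choice of attributes to compare.
ATTRIBUTES = ["Make", "FocalLength", "ISO"]


def pair_labels(exif_a, exif_b, attributes=ATTRIBUTES):
    """Return one binary label per attribute.

    A label is 1 only when both records contain the attribute and the
    values match; a missing value is treated as "not consistent" (an
    assumption made for this sketch).
    """
    labels = []
    for key in attributes:
        a, b = exif_a.get(key), exif_b.get(key)
        labels.append(int(a is not None and a == b))
    return labels


def training_pairs(photos):
    """Yield (name_a, name_b, labels) for every unordered pair of photos."""
    for (name_a, exif_a), (name_b, exif_b) in combinations(photos.items(), 2):
        yield name_a, name_b, pair_labels(exif_a, exif_b)


pairs = list(training_pairs(photos))
for name_a, name_b, labels in pairs:
    print(name_a, name_b, labels)
```

A classifier trained on such pairs would see only image content as input and the metadata-derived labels as targets. At test time, regions of a single image that the classifier judges inconsistent with the rest would be candidates for a splice.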