Visually realistic GAN-generated images have recently emerged as an important misinformation threat. Research has shown that these synthetic images contain forensic traces that are readily identifiable by forensic detectors. Unfortunately, these detectors are built upon neural networks, which are vulnerable to recently developed adversarial attacks. In this paper, we propose a new anti-forensic attack capable of fooling GAN-generated image detectors. Our attack uses an adversarially trained generator to synthesize traces that these detectors associate with real images. Furthermore, we propose a technique to train our attack so that it achieves transferability, i.e., it can fool unknown CNNs that it was not explicitly trained against. We demonstrate the performance of our attack through an extensive set of experiments, showing that it can fool eight state-of-the-art detection CNNs on synthetic images created using seven different GANs.
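To make the training objective concrete, below is a minimal PyTorch sketch of one plausible instantiation: a residual generator trained against an ensemble of frozen surrogate detectors, combining a classification term that pushes each detector toward the "real" label with an L1 fidelity term that keeps the attacked image visually faithful. Everything here, including the generator architecture, the dummy stand-in detectors, the ensemble size, and the loss weight `alpha`, is an illustrative assumption rather than the paper's actual design; training against several surrogates at once is one common route to the transferability described above.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Illustrative sketch only: shapes, architectures, and loss weights are
# assumptions, not the paper's actual method.

class AntiForensicGenerator(nn.Module):
    """Residual CNN that re-synthesizes an image so that forensic
    detectors read its traces as belonging to a real image."""
    def __init__(self, width=64):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(3, width, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(width, width, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(width, 3, 3, padding=1),
        )

    def forward(self, x):
        # Predict a small residual so the attacked image stays close
        # to the original GAN output.
        return (x + self.body(x)).clamp(0.0, 1.0)


def make_dummy_detector():
    """Stand-in for a pretrained GAN-image detector: outputs one logit,
    where a positive value means 'GAN-generated'."""
    return nn.Sequential(
        nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(inplace=True),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 1),
    )


# Ensemble of frozen surrogate detectors; only the generator is trained.
surrogates = [make_dummy_detector() for _ in range(3)]
for d in surrogates:
    d.eval()
    for p in d.parameters():
        p.requires_grad_(False)

G = AntiForensicGenerator()
opt = torch.optim.Adam(G.parameters(), lr=1e-4)
alpha = 10.0  # assumed weight on the image-fidelity term

for step in range(100):                  # stand-in training loop
    fake = torch.rand(8, 3, 64, 64)      # stand-in batch of GAN images
    attacked = G(fake)

    # Fooling term: drive every surrogate's logit toward the 'real' label (0).
    fool = sum(
        F.binary_cross_entropy_with_logits(
            d(attacked).squeeze(1), torch.zeros(attacked.size(0))
        )
        for d in surrogates
    ) / len(surrogates)

    # Fidelity term: keep the attacked image visually close to the original.
    fidelity = F.l1_loss(attacked, fake)

    loss = fool + alpha * fidelity
    opt.zero_grad()
    loss.backward()
    opt.step()
```

Averaging the fooling loss over several surrogates nudges the generator toward synthesizing traces that exploit features shared across detectors rather than quirks of a single network, which is the usual intuition for why such an attack can transfer to CNNs it was never explicitly trained against.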