Visually realistic GAN-generated images have recently emerged as an important misinformation threat. Research has shown that these synthetic images contain forensic traces that are readily identifiable by forensic detectors. Unfortunately, these detectors are built upon neural networks, which are vulnerable to recently developed adversarial attacks. In this paper, we propose a new anti-forensic attack capable of fooling GAN-generated image detectors. Our attack uses an adversarially trained generator to synthesize traces that these detectors associate with real images. Furthermore, we propose a technique to train our attack so that it achieves transferability, i.e., it can fool unknown CNNs that it was not explicitly trained against. We evaluate our attack through an extensive set of experiments, showing that it can fool eight state-of-the-art detection CNNs on synthetic images created using seven different GANs, and that it outperforms alternative attacks.
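To make the mechanism concrete, below is a minimal PyTorch sketch of how such an attack could be set up. It is an illustration under assumptions, not the paper's actual method: the generator architecture, the loss weighting `alpha`, and the names `AntiForensicGenerator`, `attack_loss`, `detectors`, and `loader` are all hypothetical, and each detector is assumed to output a single real/fake logit per image. The ensemble-averaged classification term is one plausible way to pursue the transferability the abstract describes.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AntiForensicGenerator(nn.Module):
    """Hypothetical residual image-to-image generator (a stand-in for
    the paper's unspecified architecture)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 3, kernel_size=3, padding=1),
        )

    def forward(self, x):
        # Predict a residual so the attacked image stays close to the input.
        return torch.clamp(x + self.net(x), -1.0, 1.0)

def attack_loss(detectors, x_fake, x_attacked, alpha=10.0):
    """Drive every frozen detector toward the 'real' label while keeping
    the attacked image visually close to the original GAN output."""
    real = torch.ones(x_attacked.size(0), 1, device=x_attacked.device)
    # Averaging the loss over an ensemble of detectors encourages the
    # synthesized traces to fool CNNs beyond any single training target.
    cls = sum(F.binary_cross_entropy_with_logits(d(x_attacked), real)
              for d in detectors) / len(detectors)
    fidelity = F.l1_loss(x_attacked, x_fake)
    return cls + alpha * fidelity

# Training sketch: detectors are pretrained and frozen; only G is updated.
# G = AntiForensicGenerator()
# opt = torch.optim.Adam(G.parameters(), lr=1e-4)
# for d in detectors:
#     d.eval()
#     d.requires_grad_(False)
# for x_fake in loader:
#     loss = attack_loss(detectors, x_fake, G(x_fake))
#     opt.zero_grad(); loss.backward(); opt.step()
```

The residual formulation is a common design choice in anti-forensic settings: it biases the generator toward small perturbations, so the fidelity term mainly constrains the injected traces rather than forcing the network to reconstruct the whole image.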