Visually realistic GAN-generated facial images raise obvious concerns about potential misuse. Many effective forensic algorithms have been developed in recent years to detect such synthetic images. It is therefore important to assess the vulnerability of these forensic detectors to adversarial attacks. In this paper, we propose a new black-box attack method against GAN-generated image detectors. A novel contrastive learning strategy is adopted to train an encoder-decoder-based anti-forensic model under a contrastive loss function. GAN images and their simulated real counterparts are constructed as positive and negative samples, respectively. Leveraging the trained attack model, an imperceptible contrastive perturbation can be applied to input synthetic images to remove the GAN fingerprint to some extent. As a result, existing GAN-generated image detectors are expected to be deceived. Extensive experimental results verify that the proposed attack effectively reduces the accuracy of three state-of-the-art detectors on images from six popular GANs, while preserving high visual quality in the attacked images. The source code will be available at https://github.com/ZXMMD/BAttGAND.
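To make the training objective concrete, the following is a minimal PyTorch sketch of one plausible instantiation: an encoder-decoder `generator` predicts a bounded perturbation, and an InfoNCE-style contrastive loss over features from a frozen extractor `feat_net` pulls the attacked image toward its simulated-real counterpart and away from the original GAN image. All names (`generator`, `feat_net`, `contrastive_attack_loss`), the loss form, the temperature, the perturbation budget, and the anchor/positive/negative pairing are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

# Hypothetical components (illustrative names, not from the paper):
#   generator: encoder-decoder nn.Module producing a perturbation map
#   feat_net:  frozen feature extractor used to compare samples

def contrastive_attack_loss(feat_net, x_gan, x_attacked, x_sim_real, tau=0.1):
    """InfoNCE-style contrastive loss (one plausible instantiation).

    Pulls features of the attacked image toward its simulated-real
    counterpart and pushes them away from the original GAN image,
    so that GAN-fingerprint cues are suppressed.
    """
    z_a = F.normalize(feat_net(x_attacked), dim=1)   # anchor
    z_p = F.normalize(feat_net(x_sim_real), dim=1)   # positive
    z_n = F.normalize(feat_net(x_gan), dim=1)        # negative

    pos = torch.exp((z_a * z_p).sum(dim=1) / tau)
    neg = torch.exp((z_a * z_n).sum(dim=1) / tau)
    return (-torch.log(pos / (pos + neg))).mean()

def train_step(generator, feat_net, opt, x_gan, x_sim_real, eps=4 / 255):
    """One optimization step for the anti-forensic generator."""
    opt.zero_grad()
    # Encoder-decoder predicts an imperceptible perturbation, bounded by eps.
    delta = eps * torch.tanh(generator(x_gan))
    x_attacked = (x_gan + delta).clamp(0.0, 1.0)
    loss = contrastive_attack_loss(feat_net, x_gan, x_attacked, x_sim_real)
    loss.backward()
    opt.step()
    return loss.item()
```

In such a setup, `feat_net` could be any fixed backbone whose features expose GAN artifacts; since only the generator is trained and no detector gradients are queried at attack time, the resulting perturbation operates in a black-box manner against unseen detectors, consistent with the threat model described above.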