In recent years, convolutional neural networks (CNNs) have been widely used by researchers to perform forensic tasks such as image tampering detection. At the same time, adversarial attacks have been developed that are capable of fooling CNN-based classifiers. Understanding the transferability of adversarial attacks, i.e., an attack's ability to fool a CNN other than the one it was trained against, has important implications for designing CNNs that are resistant to attacks. While attacks on object recognition CNNs are believed to be transferable, recent work by Barni et al. has shown that attacks on forensic CNNs have difficulty transferring to other CNN architectures or to CNNs trained using different datasets. In this paper, we demonstrate that adversarial attacks on forensic CNNs are even less transferable than previously thought, failing to transfer even between virtually identical CNN architectures! We show that several common adversarial attacks against CNNs trained to identify image manipulation fail to transfer to CNNs whose only difference lies in the class definitions, i.e., the same CNN architecture trained using the same data. We note that all formulations of the class definitions contain the unaltered class. This has important implications for the future design of forensic CNNs that are robust to adversarial and anti-forensic attacks.