Although adversarial attacks have achieved very high success rates in the white-box setting, most existing adversaries exhibit weak transferability in the black-box setting. To address this issue, various input transformations have been proposed to enhance attack transferability. In this work, we observe that all existing transformations are applied to a single image, which might limit the transferability of the crafted adversaries. Hence, we propose a new input-transformation-based attack, called the Admix Attack Method (AAM), that considers both the original image and an image randomly picked from other categories. Instead of calculating the gradient directly on the original input, AAM calculates the gradient on an admixed image, interpolated from the two images, in order to craft adversaries with higher transferability. Empirical evaluations on the standard ImageNet dataset demonstrate that AAM achieves much higher transferability than existing input transformation methods. When combined with other input transformations, our method further improves transferability and outperforms the state-of-the-art combination of input transformations by a clear margin of 3.4% on average when attacking nine advanced defense models.
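The core idea, computing the gradient on an admixed image rather than on the original input, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract says only that the two images are interpolated, so the additive form `x + eta * x_other`, the mixing weight `eta`, the toy linear classifier standing in for a real network, and all function names are hypothetical and not taken from the authors' implementation.

```python
import numpy as np

def admix(x, x_other, eta=0.2):
    """Blend a small portion of x_other into x (assumed additive form, assumed eta)."""
    return x + eta * x_other

def loss_grad(x, w, y):
    """Input gradient of the logistic loss log(1 + exp(-y * w.x)).

    A toy linear classifier stands in for a real network here.
    """
    margin = y * np.dot(w, x)
    return -y * w / (1.0 + np.exp(margin))

def admix_gradient(x, others, w, y, eta=0.2):
    """Average the loss gradient over admixed copies of x.

    Because the admixed image is x plus a constant, the gradient with
    respect to x equals the gradient evaluated at the admixed image.
    """
    grads = [loss_grad(admix(x, x_o, eta), w, y) for x_o in others]
    return np.mean(grads, axis=0)

def attack_step(x_adv, x_clean, others, w, y, eps=0.1, alpha=0.02, eta=0.2):
    """One sign-gradient ascent step, kept within an L-infinity ball around x_clean."""
    g = admix_gradient(x_adv, others, w, y, eta)
    x_new = x_adv + alpha * np.sign(g)
    return np.clip(x_new, x_clean - eps, x_clean + eps)

# Toy usage: one "image" of 8 pixels and three images from other categories.
rng = np.random.default_rng(0)
x = rng.random(8)
others = [rng.random(8) for _ in range(3)]
w = rng.standard_normal(8)
y = 1.0
x_adv = attack_step(x, x, others, w, y)
```

In a real attack the toy `loss_grad` would be replaced by backpropagation through the surrogate model, and the step would be iterated several times.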