Adversarial attacks provide a good way to study the robustness of deep learning models. One category of transfer-based black-box attacks applies several image transformation operations to improve the transferability of adversarial examples. This approach is effective, but it fails to take the specific characteristics of the input image into consideration. In this work, we propose a novel architecture, called Adaptive Image Transformation Learner (AITL), which incorporates different image transformation operations into a unified framework to further improve the transferability of adversarial examples. Unlike the fixed combinations of transformations used in existing works, our transformation learner adaptively selects the most effective combination of image transformations for each input image. Extensive experiments on ImageNet demonstrate that our method significantly improves attack success rates on both normally trained models and defense models under various settings.
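
To make the contrast between a fixed transformation combination and a per-image choice concrete, the following is a minimal NumPy sketch. It is not the AITL architecture: the learned selector is replaced by a brute-force search, the networks are toy linear classifiers, and every name in it (`TRANSFORMS`, `select_combo`, `W_src`, `W_eval`, the step size `eps`) is a hypothetical stand-in chosen for illustration. Each candidate combination is scored by how much a one-step sign-gradient perturbation, computed on a source model through the transformed image, raises the loss of a separate evaluation model.

```python
import itertools
import numpy as np

# Each entry is a (forward, adjoint) pair of linear maps on an HxW image, so the
# gradient can be pulled back through the transformation without autodiff.
# These four operations are illustrative assumptions, not the set used by AITL.
TRANSFORMS = {
    "identity": (lambda x: x, lambda g: g),
    "hflip": (lambda x: x[:, ::-1], lambda g: g[:, ::-1]),
    "shift": (lambda x: np.roll(x, 2, axis=1), lambda g: np.roll(g, -2, axis=1)),
    "scale": (lambda x: 0.5 * x, lambda g: 0.5 * g),
}

def loss_and_grad(W, x, label):
    """Cross-entropy loss of a toy linear classifier and its gradient w.r.t. x."""
    logits = W @ x.ravel()
    logits = logits - logits.max()          # numerical stability
    probs = np.exp(logits) / np.exp(logits).sum()
    loss = -np.log(probs[label] + 1e-12)
    probs[label] -= 1.0                     # d(loss)/d(logits) = softmax - onehot
    return loss, (W.T @ probs).reshape(x.shape)

def transformed_grad(W, x, label, combo):
    """Gradient w.r.t. x of the loss evaluated on the transformed image."""
    z = x
    for name in combo:                      # apply transformations in order
        z = TRANSFORMS[name][0](z)
    _, g = loss_and_grad(W, z, label)
    for name in reversed(combo):            # apply adjoints in reverse order
        g = TRANSFORMS[name][1](g)
    return g

def attack_once(W_src, x, label, eps, combo):
    """One sign-gradient step on the source model through the given combination."""
    g = transformed_grad(W_src, x, label, combo)
    return np.clip(x + eps * np.sign(g), 0.0, 1.0)

def select_combo(W_src, W_eval, x, label, eps, combos):
    """Brute-force stand-in for a learned selector: keep the combination whose
    perturbation yields the highest loss on the evaluation model."""
    best = (None, None, -np.inf)
    for combo in combos:
        x_adv = attack_once(W_src, x, label, eps, combo)
        loss, _ = loss_and_grad(W_eval, x_adv, label)
        if loss > best[2]:
            best = (combo, x_adv, loss)
    return best

# Toy data: one 8x8 "image", a source model, and a perturbed evaluation model.
rng = np.random.default_rng(0)
x = rng.random((8, 8))
W_src = rng.normal(size=(5, 64))
W_eval = W_src + 0.5 * rng.normal(size=(5, 64))
combos = list(itertools.product(TRANSFORMS, repeat=2))
combo, x_adv, loss = select_combo(W_src, W_eval, x, label=3, eps=0.03, combos=combos)
print("selected combination:", combo)
```

Because the search runs separately for every input, two images can end up with different combinations, which is the property a fixed combination lacks. An exhaustive search like this scales poorly with the number of operations; it only illustrates the selection objective, not how a learned model would predict the choice.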