Recent advances in attention-based networks have shown that Vision Transformers can achieve state-of-the-art or near state-of-the-art results on many image classification tasks. This puts transformers in the unique position of being a promising alternative to traditional convolutional neural networks (CNNs). While CNNs have been carefully studied with respect to adversarial attacks, the same cannot be said of Vision Transformers. In this paper, we study the robustness of Vision Transformers to adversarial examples. Our analysis of transformer security is divided into three parts. First, we test transformers under standard white-box and black-box attacks. Second, we study the transferability of adversarial examples between CNNs and transformers, and show that adversarial examples do not readily transfer between the two. Third, based on this finding, we analyze the security of a simple ensemble defense of CNNs and transformers. By creating a new attack, the self-attention blended gradient attack, we show that such an ensemble is not secure under a white-box adversary. However, under a black-box adversary, we show that an ensemble can achieve unprecedented robustness without sacrificing clean accuracy. Our analysis uses six types of white-box attacks and two types of black-box attacks, and our study encompasses multiple Vision Transformers, Big Transfer models, and CNN architectures trained on CIFAR-10, CIFAR-100, and ImageNet.