Vision Transformers (ViTs) have recently demonstrated exemplary performance on a variety of vision tasks and are being used as an alternative to CNNs. Their design is based on a self-attention mechanism that processes images as a sequence of patches, which differs markedly from CNNs. It is therefore interesting to study whether ViTs are vulnerable to backdoor attacks. A backdoor attack occurs when an attacker poisons a small part of the training data for malicious purposes: the model performs well on clean test images, but the attacker can manipulate its decisions by presenting the trigger at test time. To the best of our knowledge, we are the first to show that ViTs are vulnerable to backdoor attacks. We also find an intriguing difference between ViTs and CNNs: interpretation algorithms effectively highlight the trigger on test images for ViTs but not for CNNs. Based on this observation, we propose a test-time image blocking defense for ViTs which reduces the attack success rate by a large margin. Code is available here: https://github.com/UCDvision/backdoor_transformer.git
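To make the blocking idea concrete, below is a minimal sketch of a test-time image-blocking defense: locate the most salient patch of a test image with an interpretation method, zero it out, and classify the blocked image. The input-gradient saliency, the `blocked_prediction` helper, and the 16-pixel patch size are illustrative assumptions for this sketch, not the paper's exact algorithm.

```python
# A minimal sketch of a test-time image-blocking defense for a backdoored
# classifier. Assumptions (not the paper's exact method): saliency is a
# plain input-gradient map, the suspected trigger is the single most
# salient non-overlapping patch, and blocking means zeroing that patch.
import torch
import torch.nn.functional as F

def blocked_prediction(model, image, patch=16):
    """Classify `image` (shape (1, 3, H, W), H and W divisible by `patch`)
    after blocking its most salient patch."""
    model.eval()
    x = image.clone().requires_grad_(True)

    # Saliency: gradient of the top logit w.r.t. the input pixels.
    logits = model(x)
    logits[0, logits[0].argmax()].backward()
    saliency = x.grad.abs().sum(dim=1, keepdim=True)  # (1, 1, H, W)

    # Average saliency over each non-overlapping patch; pick the hottest one.
    per_patch = F.avg_pool2d(saliency, kernel_size=patch, stride=patch)
    idx = per_patch.flatten().argmax().item()
    grid_w = per_patch.shape[-1]                       # patches per row
    r, c = (idx // grid_w) * patch, (idx % grid_w) * patch

    # Block the suspected trigger region and re-classify.
    blocked = image.clone()
    blocked[:, :, r:r + patch, c:c + patch] = 0.0
    with torch.no_grad():
        return model(blocked).argmax(dim=1)
```

Usage would be `label = blocked_prediction(vit_model, test_image)`. The intuition from the observation above is that, for a backdoored ViT, interpretation maps reliably land on the trigger patch, so blocking the most salient region tends to remove the trigger while leaving clean predictions largely intact.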