With the thriving of deep learning and the widespread practice of using pre-trained networks, backdoor attacks have become an increasing security threat attracting significant research interest in recent years. A third-party model can be poisoned during training so that it works well under normal conditions but behaves maliciously when a trigger pattern appears. However, existing backdoor attacks are all built on noise-perturbation triggers, making them noticeable to humans. In this paper, we instead propose using warping-based triggers. The proposed backdoor outperforms previous methods in a human inspection test by a wide margin, proving its stealthiness. To make such models undetectable by machine defenders, we propose a novel training mode, called the ``noise mode''. The trained networks successfully attack and bypass state-of-the-art defense methods on standard classification datasets, including MNIST, CIFAR-10, GTSRB, and CelebA. Behavior analyses show that our backdoors are transparent to network inspection, further proving the efficiency of this novel attack mechanism.
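To make the idea of a warping-based trigger concrete, here is a minimal sketch assuming a PyTorch setup: a small random control grid is upsampled into a smooth, fixed warping field, which is then applied to images via grid sampling. The helper names (make_warp_grid, apply_trigger) and the parameters k (control-grid size) and s (warp strength) are illustrative assumptions, not the paper's exact implementation.

```python
import torch
import torch.nn.functional as F

def make_warp_grid(k=4, s=0.5, size=32):
    # Hypothetical sketch: a small random control grid, upsampled to a
    # smooth, subtle warping field (the fixed backdoor trigger).
    ctrl = torch.rand(1, 2, k, k) * 2 - 1              # offsets in [-1, 1]
    ctrl = ctrl / ctrl.abs().mean()                    # normalize magnitude
    field = F.interpolate(ctrl, size=(size, size),
                          mode="bicubic", align_corners=True)
    field = field.permute(0, 2, 3, 1) * s / size       # (1, H, W, 2), small shifts
    # Identity sampling grid in normalized [-1, 1] coordinates.
    ys, xs = torch.meshgrid(torch.linspace(-1, 1, size),
                            torch.linspace(-1, 1, size), indexing="ij")
    identity = torch.stack((xs, ys), dim=2).unsqueeze(0)
    return torch.clamp(identity + field, -1, 1)

def apply_trigger(images, grid):
    # Warp a batch of clean images (N, C, H, W) with the fixed trigger grid;
    # the result looks like a natural, barely perceptible distortion.
    return F.grid_sample(images,
                         grid.expand(images.size(0), -1, -1, -1),
                         align_corners=True)

# Usage: generate the trigger once, then poison a batch during training.
# grid = make_warp_grid()
# poisoned = apply_trigger(clean_batch, grid)
```

Because the trigger is a smooth geometric deformation rather than an additive noise patch, the poisoned image contains no conspicuous overlay for a human inspector to spot.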