Deep neural networks (DNNs) are increasingly deployed in safety-critical systems such as autonomous vehicles and medical devices. Soon after this adoption, the vulnerability of DNNs to stealthy adversarial examples was revealed: inputs crafted by adding tiny perturbations to clean samples can cause a DNN to misclassify. To improve the robustness of DNNs, several algorithmic countermeasures against adversarial examples have since been introduced. In this paper, we propose a new type of stealthy attack on protected DNNs that circumvents these algorithmic defenses: by selectively flipping bits in DNN weights, we preserve the classification accuracy on clean inputs while forcing misclassification of crafted inputs, even when algorithmic countermeasures are in place. To fool protected DNNs stealthily, we introduce a novel method that efficiently identifies their most vulnerable weight bits and flips them in hardware. Experimental results show that our attack succeeds against state-of-the-art algorithmically protected DNNs.
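To make the underlying mechanism concrete, below is a minimal sketch, assuming 8-bit quantized weights stored in memory, of how a single flipped bit alters a stored weight value. The NumPy code, toy weight values, and bit positions are illustrative assumptions only, not the paper's actual vulnerability-search procedure or hardware fault-injection setup.

```python
# Minimal sketch: effect of flipping one bit of an int8-quantized DNN weight.
# Values and bit positions are illustrative assumptions, not from the paper.
import numpy as np

def flip_bit(weights: np.ndarray, index: int, bit: int) -> np.ndarray:
    """Return a copy of an int8 weight tensor with one bit flipped."""
    flipped = weights.copy()
    raw = flipped.view(np.uint8)        # reinterpret the bytes in place
    raw[index] ^= np.uint8(1 << bit)    # flip the chosen bit of one weight
    return flipped

w = np.array([23, -4, 81], dtype=np.int8)   # toy quantized weights
for bit in (0, 6, 7):                        # low bit, magnitude bit, sign bit
    print(f"bit {bit}: {w[1]} -> {flip_bit(w, 1, bit)[1]}")
# Flipping high-order bits changes a weight drastically; an attacker of the
# kind described above searches for the few weight bits whose flip leaves
# clean accuracy intact while restoring misclassification of crafted inputs.
```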