State-of-the-art deep neural networks (DNNs) have been proven vulnerable to adversarial manipulation and backdoor attacks. Backdoored models deviate from expected behavior on inputs carrying predefined triggers while retaining performance on clean data. Recent works focus on software simulation of backdoor injection during the inference phase by modifying network weights, which we find is often unrealistic in practice due to hardware restrictions. In contrast, in this work we present, for the first time, an end-to-end backdoor injection attack realized on actual hardware against a classifier model, using Rowhammer as the fault-injection method. To this end, we first investigate the viability of backdoor injection attacks in real-life deployments of DNNs on hardware and address the practical issues of hardware implementation from a novel optimization perspective. We are motivated by the observation that vulnerable memory locations are very rare, device-specific, and sparsely distributed. Consequently, we propose a novel network training algorithm based on constrained optimization to achieve a realistic backdoor injection attack in hardware. By modifying parameters uniformly across the convolutional and fully connected layers and jointly optimizing the trigger pattern, we achieve state-of-the-art attack performance with fewer bit flips. For instance, our method on a hardware-deployed ResNet-20 model trained on CIFAR-10 achieves over 91% test accuracy and a 94% attack success rate by flipping only 10 out of 2.2 million bits.
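The constrained-optimization idea described above can be illustrated with a minimal PyTorch sketch. This is not the authors' exact algorithm: the function name `backdoor_joint_optimization`, the loss weighting, and the top-k projection that caps the number of changed parameters are illustrative assumptions, and the real attack further restricts modifications to Rowhammer-flippable bit locations identified on the target device, which this sketch abstracts away.

```python
import torch
import torch.nn.functional as F

def backdoor_joint_optimization(model, loader, target_class,
                                trigger_shape=(3, 32, 32), steps=1000,
                                lr=1e-2, max_changes=10, lambda_clean=1.0):
    """Illustrative joint optimization of a trigger pattern and a small set of
    model parameters, with a hard cap on how many parameters may change
    (a software proxy for a limited bit-flip budget). Hypothetical sketch."""
    device = next(model.parameters()).device
    trigger = torch.zeros(trigger_shape, device=device, requires_grad=True)

    params = list(model.parameters())
    # Keep the original values so the change budget can be enforced by projection.
    originals = [p.detach().clone() for p in params]

    opt = torch.optim.Adam([trigger] + params, lr=lr)
    data_iter = iter(loader)
    for _ in range(steps):
        try:
            x, y = next(data_iter)
        except StopIteration:
            data_iter = iter(loader)
            x, y = next(data_iter)
        x, y = x.to(device), y.to(device)

        # Clean term: preserve accuracy on unmodified inputs.
        loss_clean = F.cross_entropy(model(x), y)

        # Backdoor term: inputs stamped with the trigger map to target_class.
        x_trig = torch.clamp(x + trigger, 0.0, 1.0)
        loss_bd = F.cross_entropy(model(x_trig), torch.full_like(y, target_class))

        loss = lambda_clean * loss_clean + loss_bd
        opt.zero_grad()
        loss.backward()
        opt.step()

        # Projection: keep only the `max_changes` largest parameter deviations,
        # reset every other parameter to its original value.
        with torch.no_grad():
            deltas = torch.cat([(p - o).abs().flatten()
                                for p, o in zip(params, originals)])
            if deltas.numel() > max_changes:
                threshold = torch.topk(deltas, max_changes).values.min()
                for p, o in zip(params, originals):
                    keep = (p - o).abs() >= threshold
                    p.copy_(torch.where(keep, p, o))
    return trigger.detach()
```

The projection step is the sketch's stand-in for the paper's hardware constraint: instead of choosing which specific memory bits can physically be flipped, it simply limits how many parameters are allowed to deviate from their original values while the trigger and the surviving weight changes are optimized together.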