State-of-the-art deep neural networks (DNNs) have been shown to be vulnerable to adversarial manipulation and backdoor attacks. A backdoored model deviates from its expected behavior on inputs containing a predefined trigger while retaining its performance on clean data. Recent works focus on software simulations of backdoor injection at inference time by modifying network weights, which we find to be often unrealistic in practice due to hardware restrictions such as bit allocation in memory. In contrast, in this work we investigate the viability of backdoor injection attacks on real-life hardware deployments of DNNs and address these practical issues from a novel optimization perspective. We are motivated by the fact that the vulnerable memory locations are very rare, device-specific, and sparsely distributed. Consequently, we propose a novel network training algorithm based on constrained optimization for realistic backdoor injection attacks in hardware. By uniformly modifying parameters across the convolutional and fully-connected layers and jointly optimizing the trigger pattern, we achieve state-of-the-art attack performance with fewer bit flips. For instance, on a hardware-deployed ResNet-20 model trained on CIFAR-10, our method achieves over 91% test accuracy and a 94% attack success rate by flipping only 10 bits out of 2.2 million.
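To make the constrained-optimization idea concrete, the following is a minimal PyTorch-style sketch of such a joint objective, not the authors' implementation: one term preserves clean accuracy, a second term drives triggered inputs to an attacker-chosen class, and a projection step keeps only a small budget of parameter deviations from the deployed weights (a software proxy for the few flippable bits on the device). All names and parameters below (backdoor_step, mask, k_budget, lam, the per-tensor budget) are illustrative assumptions; the trigger tensor would be registered as a learnable parameter in the same optimizer to realize the joint trigger optimization.

import torch
import torch.nn.functional as F

def backdoor_step(model, original_params, trigger, mask, x, y, target_class,
                  optimizer, k_budget=10, lam=1.0):
    # Stamp the (learnable) trigger pattern onto the clean batch.
    x_trig = x * (1 - mask) + trigger * mask
    # Term 1: preserve accuracy on clean inputs.
    loss_clean = F.cross_entropy(model(x), y)
    # Term 2: drive triggered inputs to the attacker's target class.
    t = torch.full_like(y, target_class)
    loss_backdoor = F.cross_entropy(model(x_trig), t)
    loss = loss_clean + lam * loss_backdoor
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    # Projection: keep only the k_budget largest deviations from the deployed
    # weights in each tensor and reset the rest, so the attacked model differs
    # from the original in only a handful of memory locations.
    with torch.no_grad():
        for p, p0 in zip(model.parameters(), original_params):
            dev = (p - p0).abs()
            if dev.numel() > k_budget:
                thresh = dev.flatten().topk(k_budget).values.min()
                p.copy_(torch.where(dev >= thresh, p, p0))
    return loss.item()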