DNNs are known to be vulnerable to so-called adversarial attacks, which manipulate inputs to cause incorrect results that can be beneficial to an attacker or damaging to the victim. Recent works have proposed approximate computation as a defense mechanism against such machine learning attacks. We show that these approaches, while successful for a range of inputs, are insufficient against stronger, high-confidence adversarial attacks. To address this, we propose DNNSHIELD, a hardware-accelerated defense that adapts the strength of its response to the confidence of the adversarial input. Our approach relies on dynamic and random sparsification of the DNN model to achieve inference approximation efficiently and with fine-grain control over the approximation error. DNNSHIELD detects adversarial inputs by comparing the output distribution characteristics of sparsified inference against a dense reference. We show an adversarial detection rate of 86% when applied to VGG16 and 88% when applied to ResNet50, which exceeds the detection rate of state-of-the-art approaches at much lower overhead. We demonstrate a software/hardware-accelerated FPGA prototype, which reduces the performance impact of DNNSHIELD relative to software-only CPU and GPU implementations.
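To make the detection idea concrete, the sketch below illustrates one plausible software rendering of the approach described above: run several inference passes with randomly sparsified weights and flag inputs whose output distribution diverges strongly from the dense reference. This is a minimal PyTorch sketch under our own assumptions, not the paper's actual hardware mechanism; the function names (`sparsified_logits`, `adversarial_score`), the KL-divergence score, and parameters such as `drop_rate` and `n_trials` are illustrative choices, and the decision threshold would need to be calibrated on benign data.

```python
import torch
import torch.nn.functional as F

def sparsified_logits(model, x, drop_rate):
    """One approximate inference pass: randomly zero a fraction of each
    weight tensor (a hypothetical software stand-in for DNNSHIELD's
    hardware-level random sparsification)."""
    saved = {}
    with torch.no_grad():
        for name, p in model.named_parameters():
            if p.dim() > 1:  # sparsify weight matrices/kernels only
                saved[name] = p.clone()
                mask = (torch.rand_like(p) > drop_rate).float()
                p.mul_(mask)
        logits = model(x)
        # Restore the dense weights after the approximate pass.
        for name, p in model.named_parameters():
            if name in saved:
                p.copy_(saved[name])
    return logits

def adversarial_score(model, x, drop_rate=0.1, n_trials=8):
    """Compare the dense output distribution against an ensemble of
    randomly sparsified passes. The intuition from the abstract: high-
    confidence adversarial inputs tend to be less stable under model
    approximation, so their divergence is larger."""
    model.eval()
    with torch.no_grad():
        dense = F.softmax(model(x), dim=-1)
        divs = []
        for _ in range(n_trials):
            sparse = F.softmax(sparsified_logits(model, x, drop_rate), dim=-1)
            # KL(dense || sparse); F.kl_div expects log-probs as input.
            divs.append(F.kl_div(sparse.log(), dense, reduction="batchmean"))
    # Flag the input as adversarial if this score exceeds a calibrated
    # threshold (threshold selection is outside this sketch).
    return torch.stack(divs).mean()
```

A usage note on the design: averaging over several random masks reduces the variance of the score, mirroring the abstract's point that the defense uses the distribution of sparsified outputs rather than a single approximate pass; adapting `drop_rate` to the model's confidence would correspond to the adaptive response strength described above.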