Deep neural network (DNN) classifiers are powerful tools that drive a broad spectrum of important applications, from image recognition to autonomous vehicles. Unfortunately, DNNs are known to be vulnerable to adversarial attacks that affect virtually all state-of-the-art models. These attacks make small, imperceptible modifications to inputs that are sufficient to induce the DNN to produce a wrong classification. In this paper, we propose a novel, lightweight adversarial correction and/or detection mechanism for image classifiers that relies on undervolting (running a chip at a voltage slightly below its safe margin). We propose applying controlled undervolting to the chip running the inference process in order to introduce a limited number of compute errors. We show that these errors disrupt the adversarial input in a way that can be used either to correct the classification or to detect the input as adversarial. We evaluate the proposed solution on an FPGA design and through software simulation. Across 10 attacks on two popular DNNs, we show an average detection rate of 80% to 95%.
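To make the detection idea concrete, the following minimal Python sketch simulates the core mechanism described above: injecting a small number of compute errors during inference and flagging an input as adversarial if its predicted label is unstable under those errors. Everything here is an illustrative assumption, not the paper's actual implementation: the toy two-layer model, the `error_rate` fault model standing in for undervolting, and the `detect_adversarial` decision rule with its `threshold` are all hypothetical.

```python
import numpy as np

# Illustrative sketch: approximate undervolting-induced faults by corrupting a
# random fraction of hidden activations during inference. The model weights,
# error model, and detection rule are stand-ins, not the paper's method.
rng = np.random.default_rng(0)

# Toy 2-layer classifier (random weights stand in for a trained DNN).
W1, b1 = rng.normal(size=(784, 128)), np.zeros(128)
W2, b2 = rng.normal(size=(128, 10)), np.zeros(10)

def forward(x, error_rate=0.0):
    """Forward pass; with error_rate > 0, a random subset of hidden
    activations is corrupted to mimic undervolting compute errors."""
    h = np.maximum(x @ W1 + b1, 0.0)
    if error_rate > 0.0:
        mask = rng.random(h.shape) < error_rate
        h = np.where(mask, rng.normal(scale=h.std() + 1e-8, size=h.shape), h)
    return int(np.argmax(h @ W2 + b2))

def detect_adversarial(x, n_trials=16, error_rate=0.01, threshold=0.5):
    """Hypothetical decision rule: flag the input as adversarial if the label
    under injected errors disagrees with the clean label too often."""
    clean_label = forward(x)
    flips = sum(forward(x, error_rate) != clean_label for _ in range(n_trials))
    return flips / n_trials > threshold

x = rng.normal(size=784)  # stand-in for a (possibly adversarial) input image
print(detect_adversarial(x))
```

A majority vote over the error-injected predictions could likewise serve as the "correction" variant mentioned in the abstract; the threshold-based rule above only covers the detection case.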