Backdoors pose a serious threat to machine learning, as they can compromise the integrity of security-critical systems, such as self-driving cars. While different defenses have been proposed to address this threat, they all rely on the assumption that the hardware on which the learning models are executed during inference is trusted. In this paper, we challenge this assumption and introduce a backdoor attack that completely resides within a common hardware accelerator for machine learning. Outside of the accelerator, neither the learning model nor the software is manipulated, so current defenses fail. To make this attack practical, we overcome two challenges: First, as memory on a hardware accelerator is severely limited, we introduce the concept of a minimal backdoor that deviates as little as possible from the original model and is activated by replacing only a few model parameters. Second, we develop a configurable hardware trojan that can be provisioned with the backdoor and performs the replacement only when the specific target model is processed. We demonstrate the practical feasibility of our attack by implanting our hardware trojan into the Xilinx Vitis AI DPU, a commercial machine-learning accelerator. We configure the trojan with a minimal backdoor for a traffic-sign recognition system. The backdoor replaces only 30 (0.069%) of the model parameters, yet it reliably manipulates the recognition once the input contains a backdoor trigger. Our attack expands the hardware circuit of the accelerator by 0.24% and induces no run-time overhead, rendering detection hardly possible. Given the complex and highly distributed manufacturing process of current hardware, our work points to a new threat in machine learning that is inaccessible to current security mechanisms and calls for hardware to be manufactured only in fully trusted environments.
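The core mechanism the abstract describes, a trojan that holds a small set of parameter patches and applies them only when the specific target model is processed, can be illustrated with a minimal sketch. This is not the paper's hardware implementation; the fingerprinting scheme, patch format, and all names below are illustrative assumptions, and the logic is shown in software rather than in an accelerator circuit.

```python
import hashlib
import numpy as np

# Hypothetical (index, value) patches forming a "minimal backdoor":
# only a handful of parameters are ever replaced.
PATCHES = [(3, 0.75), (17, -1.25)]

def fingerprint(params: np.ndarray) -> str:
    """Identify a model by a hash over its raw parameter bytes
    (an assumed stand-in for however the trojan recognizes its target)."""
    return hashlib.sha256(params.tobytes()).hexdigest()

def maybe_patch(params: np.ndarray, target_fp: str) -> np.ndarray:
    """Pass parameters through unchanged unless they belong to the
    target model, in which case the stored patches are applied."""
    if fingerprint(params) != target_fp:
        return params  # any other model is processed untouched
    patched = params.copy()
    for idx, val in PATCHES:
        patched[idx] = val  # replace a single model parameter
    return patched

# Demo: only the matching model gets its parameters replaced.
target_model = np.arange(32, dtype=np.float32)
other_model = np.ones(32, dtype=np.float32)
target_fp = fingerprint(target_model)

patched = maybe_patch(target_model, target_fp)
untouched = maybe_patch(other_model, target_fp)
print(np.flatnonzero(patched != target_model))  # indices of replaced parameters
```

The conditional activation is what makes such a trojan stealthy in the setting the abstract describes: for every model except the target, the accelerator's behavior is bit-identical to a benign one, so functional testing on other workloads reveals nothing.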