Nowadays, deploying deep-learning-based applications on edge devices is an essential task, owing to the increasing demand for intelligent services. However, the limited computing resources on edge nodes make the models vulnerable to attacks, such that the predictions made by the models are unreliable. In this paper, we investigate latency attacks on deep learning applications. Unlike common adversarial attacks aimed at misclassification, the goal of a latency attack is to increase the inference time, which may prevent applications from responding to requests within a reasonable time. This kind of attack applies to a wide range of applications, and we use object detection to demonstrate how such attacks work. We also design a framework named Overload to generate latency attacks at scale. Our method is based on a newly formulated optimization problem and a novel technique, called spatial attention, to increase the inference time of object detection. We have conducted experiments using YOLOv5 models on an Nvidia NX. The experimental results show that under a latency attack, the inference time for a single image can be increased to ten times that of the normal setting. Moreover, compared to existing methods, our attack is simpler and more effective.
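The abstract does not spell out the attack mechanism, but one plausible latency bottleneck in object detection (assumed here purely for illustration, not taken from the paper) is post-processing: greedy non-maximum suppression (NMS) does pairwise overlap checks against every surviving candidate, so its cost grows with the number of candidate boxes, and an input crafted to inflate the candidate count inflates end-to-end latency. A minimal NumPy sketch of greedy NMS makes this dependence concrete:

```python
import numpy as np

def iou(box, boxes):
    """IoU between one box [x1, y1, x2, y2] and an (N, 4) array of boxes."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area_a = (box[2] - box[0]) * (box[3] - box[1])
    area_b = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area_a + area_b - inter + 1e-9)

def greedy_nms(boxes, scores, iou_thr=0.5):
    """Standard greedy NMS: keep the highest-scoring box, drop overlaps, repeat.

    Worst case (no candidates overlap, so none are suppressed) performs
    O(N^2) IoU computations -- the cost an attacker would try to trigger.
    """
    order = np.argsort(scores)[::-1]  # indices, descending by score
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        # Compare the kept box against all remaining candidates.
        overlap = iou(boxes[i], boxes[order[1:]])
        order = order[1:][overlap <= iou_thr]
    return keep
```

In the benign case most candidates overlap a few true objects and are suppressed after one or two outer iterations; if an adversarial input makes the detector emit thousands of mutually disjoint high-confidence candidates, the loop runs once per candidate and the quadratic worst case dominates inference time.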