Deep Neural Networks (DNNs) are increasingly adopted for video analytics on mobile devices. To reduce the delay of running DNNs, many mobile devices are equipped with Neural Processing Units (NPUs). However, due to the resource limitations of NPUs, these DNNs have to be compressed, which increases processing speed at the cost of accuracy. To address this low-accuracy problem, we propose a Confidence Based Offloading (CBO) framework for deep learning video analytics. The major challenge is to determine when to return the NPU classification result, based on the confidence level of the DNN running on the NPU, and when to offload video frames to the server for further processing to increase accuracy. We first identify the problem with using existing confidence scores to make offloading decisions, and propose confidence score calibration techniques to improve performance. We then formulate the CBO problem, where the goal is to maximize accuracy under a time constraint, and propose an adaptive solution that determines which frames to offload, and at what resolution, based on the confidence score and the network condition. Through real implementations and extensive evaluations, we demonstrate that the proposed solution significantly outperforms other approaches.
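To make the offloading decision concrete, the following is a minimal sketch and not the implementation evaluated in this work. It assumes temperature scaling as a stand-in calibration method, since the abstract does not specify the calibration techniques, and a simple rule that offloads a low-confidence frame at the highest resolution whose upload and server time fit the deadline. All names (`softmax`, `pick_resolution`, `decide`) and all parameter values are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Softmax with temperature scaling (assumed calibration method)."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def pick_resolution(frame_bytes, bandwidth_bps, server_time_s, deadline_s):
    """Return the highest resolution that meets the deadline, or None.

    frame_bytes maps a resolution (e.g. 224) to the encoded frame size in bytes.
    """
    feasible = [
        res
        for res, size in frame_bytes.items()
        if size * 8 / bandwidth_bps + server_time_s <= deadline_s
    ]
    return max(feasible) if feasible else None


def decide(logits, temperature, threshold, frame_bytes,
           bandwidth_bps, server_time_s, deadline_s):
    """Return ("local", None) or ("offload", resolution) for one frame."""
    confidence = max(softmax(logits, temperature))
    if confidence >= threshold:
        # The calibrated confidence is high enough to trust the NPU result.
        return ("local", None)
    res = pick_resolution(frame_bytes, bandwidth_bps, server_time_s, deadline_s)
    if res is None:
        # No resolution fits the deadline, so fall back to the NPU result.
        return ("local", None)
    return ("offload", res)


if __name__ == "__main__":
    sizes = {224: 20_000, 448: 80_000}  # hypothetical encoded sizes in bytes
    print(decide([4.0, 1.0, 0.0], 3.0, 0.8, sizes, 2_000_000, 0.05, 0.2))
```

In this sketch a temperature above 1 flattens the softmax output, so an overconfident NPU prediction falls below the threshold and becomes a candidate for offloading. The resolution is then limited by the available bandwidth.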