Recent advances in computer vision have led to growing interest in deploying visual analytics models on mobile devices. However, most mobile devices have limited computing power, which prevents them from running large-scale visual analytics neural networks. An emerging approach to solving this problem is to offload the computation of these neural networks to computing resources at an edge server. Efficient computation offloading requires optimizing the trade-off among multiple objectives, including compressed data rate, analytics performance, and computation speed. In this work, we consider a "split computation" system that offloads part of the computation of the YOLO object detection model. We propose a learnable feature compression approach that compresses the intermediate YOLO features with lightweight computation. We train the feature compression and decompression modules jointly with the YOLO model to optimize object detection accuracy under a rate constraint. Compared to baseline methods that apply either standard image compression or learned image compression on the mobile device and perform image decompression and YOLO at the edge, the proposed system achieves higher detection accuracy in the low-to-medium rate range. Furthermore, the proposed system requires substantially less computation time on a CPU-only mobile device.
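To make the data flow of such a split-computation system concrete, the following is a minimal toy sketch, not the method proposed in this work. Every function name is hypothetical: `mobile_head` and `edge_tail` stand in for the two halves of a detection network, and uniform quantization followed by zlib stands in for the learned feature compression and decompression modules, which in the actual system are trained jointly with YOLO.

```python
import struct
import zlib


def mobile_head(image):
    """Toy stand-in for the layers run on the device: 2x2 average pooling."""
    h, w = len(image), len(image[0])
    return [
        [
            (image[r][c] + image[r][c + 1] + image[r + 1][c] + image[r + 1][c + 1]) / 4.0
            for c in range(0, w - 1, 2)
        ]
        for r in range(0, h - 1, 2)
    ]


def encode_features(features, step):
    """Assumed codec: uniform quantization to int16, then zlib (not a learned codec)."""
    rows, cols = len(features), len(features[0])
    symbols = [round(v / step) for row in features for v in row]
    payload = struct.pack(f"<{len(symbols)}h", *symbols)
    # A 4-byte header carries the feature-map shape so the edge can rebuild it.
    return struct.pack("<HH", rows, cols) + zlib.compress(payload)


def decode_features(bitstream, step):
    """Edge-side inverse of encode_features: decompress and dequantize."""
    rows, cols = struct.unpack("<HH", bitstream[:4])
    symbols = struct.unpack(f"<{rows * cols}h", zlib.decompress(bitstream[4:]))
    return [[symbols[r * cols + c] * step for c in range(cols)] for r in range(rows)]


def edge_tail(features):
    """Toy stand-in for the layers run at the edge: location of the strongest response."""
    _, row, col = max((v, r, c) for r, line in enumerate(features) for c, v in enumerate(line))
    return row, col


# Usage: an 8x8 "image" with one bright region.
image = [[0.0] * 8 for _ in range(8)]
image[4][6] = 1.0
image[5][6] = 1.0

features = mobile_head(image)                      # computed on the mobile device
bitstream = encode_features(features, step=0.1)    # sent over the network
decoded = decode_features(bitstream, step=0.1)     # reconstructed at the edge
print(len(bitstream), "bytes sent; detection at", edge_tail(decoded))
```

In this sketch the quantization step plays the role of the rate control knob: a larger step yields fewer distinct symbols and a smaller bitstream at the cost of a coarser reconstruction, which is the kind of rate versus accuracy trade-off that a learned codec would instead optimize during training.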