Partitioning CNN model computation between edge devices and servers has been proposed to mitigate the limited computing capability of edge devices and the constraints of network transmission. However, because the intermediate outputs of CNN models are large, transmission latency remains the bottleneck of such partition-offloading. Although image compression methods such as JPEG-based compression can be applied to the intermediate output data of CNN models, their compression rates are limited and the compression causes high accuracy loss. Other compression methods for partition-offloading rely on deep learning and require hours of additional training. In this paper, we propose DISC, a novel compression method for the intermediate output data of CNN models. DISC can be applied to partition-offloading systems in a plug-and-play manner without any additional training, and it outperforms compression methods designed for images when compressing intermediate output data. Furthermore, we develop AGLOP to optimize the partition-offloading system by adjusting the partition point and the hyper-parameters of DISC. In our evaluation, DISC achieves over 98% data size reduction with less than 1% accuracy loss, and AGLOP achieves over 91.2% end-to-end execution latency reduction compared with the original partition-offloading.