Convolutional neural networks (CNNs) achieve remarkable performance across a wide range of fields. However, the intensive memory accesses required for activations incur considerable energy consumption, impeding the deployment of CNNs on resource-constrained edge devices. Existing work on activation compression proposes transforming feature maps for higher compressibility, thereby enabling dimension reduction. Nevertheless, under aggressive dimension reduction, these methods suffer a severe accuracy drop. To improve the trade-off between classification accuracy and compression ratio, we propose a compression-aware projection system that employs a learnable projection to compensate for the reconstruction loss. In addition, a greedy selection metric is introduced to optimize the layer-wise compression-ratio allocation by considering both accuracy and bit-count reduction simultaneously. Our test results show that the proposed methods reduce memory access by 2.91x~5.97x with negligible accuracy drop on MobileNetV2/ResNet18/VGG16.
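The learnable-projection idea above can be illustrated with a minimal sketch: a linear projection compresses activation channels before they would be written to memory, a paired matrix reconstructs them, and both are trained to minimize the reconstruction loss. The shapes, the low-rank synthetic activations, the plain gradient-descent loop, and the learning rate are all illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: a feature map flattened to (spatial positions, channels).
C, Cp, N = 16, 4, 256  # channels, compressed channels, spatial positions

# Synthetic low-rank activations, standing in for a real layer's feature map.
X = rng.standard_normal((N, Cp)) @ rng.standard_normal((Cp, C))

P = rng.standard_normal((C, Cp)) * 0.1   # learnable compression projection
R = rng.standard_normal((Cp, C)) * 0.1   # learnable reconstruction matrix

init_loss = np.mean((X @ P @ R - X) ** 2)

lr = 5e-3
for _ in range(3000):
    Z = X @ P            # compressed activations (the part stored in memory)
    E = Z @ R - X        # reconstruction error
    # Gradients (up to a constant factor) of the mean-squared reconstruction loss.
    gR = Z.T @ E / N
    gP = X.T @ (E @ R.T) / N
    P -= lr * gP
    R -= lr * gR

loss = np.mean((X @ P @ R - X) ** 2)
```

Because only the Cp-channel tensor `Z` would be written to and read from memory, channel reduction from C to Cp directly cuts activation traffic, while the trained pair (P, R) keeps the reconstruction loss low.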