We study the neural network (NN) compression problem, viewing the tension between the compression ratio and NN performance through the lens of rate-distortion theory. We choose a distortion metric that reflects the effect of NN compression on the model output and then derive the tradeoff between rate (compression ratio) and distortion. In addition to characterizing the theoretical limits of NN compression, this formulation shows that \emph{pruning}, implicitly or explicitly, must be part of a good compression algorithm. This observation bridges a gap between the NN compression and data compression literatures, providing insight into the empirical success of pruning for NN compression. Finally, we propose a novel pruning strategy derived from our information-theoretic formulation and show that it outperforms the relevant baselines on the CIFAR-10 and ImageNet datasets.
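For concreteness, a standard rate-distortion formulation adapted to this setting (a minimal sketch; the notation $W$, $\hat{W}$, $f_W$, and the input distribution over $x$ are introduced here for illustration and are not fixed by the abstract) measures distortion as the expected discrepancy between the outputs of the original and compressed models, and defines the rate-distortion function as the least mutual information compatible with a distortion budget $D$:
\[
d\bigl(W,\hat{W}\bigr) \;=\; \mathbb{E}_{x}\Bigl[\bigl\|f_W(x) - f_{\hat{W}}(x)\bigr\|_2^2\Bigr],
\qquad
R(D) \;=\; \min_{P_{\hat{W}\mid W}\,:\,\mathbb{E}[d(W,\hat{W})]\,\le\, D} I\bigl(W;\hat{W}\bigr),
\]
where $W$ denotes the original weights, $\hat{W}$ the compressed weights, and $f_W(x)$ the network output on input $x$.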