Machine learning models have achieved human-level performance on a variety of tasks. This success comes at a high cost in computation and storage, which makes machine learning algorithms difficult to deploy on edge devices. Typically, accuracy must be partially sacrificed in exchange for gains in efficiency, quantified in terms of reduced memory usage and energy consumption. Current methods compress networks by reducing the precision of the parameters or by eliminating redundant ones. In this paper, we propose a new perspective on network compression through the Bayesian framework. We show that Bayesian neural networks automatically discover redundancy in model parameters, thus enabling self-compression, which is linked to the propagation of uncertainty through the layers of the network. Our experimental results show that the network architecture can be successfully compressed by deleting parameters identified by the network itself while retaining the same level of accuracy.
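As a rough illustration of the idea (a minimal sketch, not the authors' implementation), the posterior uncertainty learned by a mean-field Bayesian layer can itself flag redundant weights: where the posterior standard deviation dominates the posterior mean, the weight carries little information and is a candidate for removal. The layer shape, threshold, and posterior statistics below are assumed for illustration only.

```python
# Minimal sketch (assumed example): pruning a mean-field Bayesian layer
# by posterior signal-to-noise ratio (SNR). Weights whose posterior mean
# is small relative to its standard deviation are treated as redundant.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical variational posterior for a 256x128 fully connected layer:
# each weight w_ij ~ N(mu_ij, sigma_ij^2), as learned by variational inference.
mu = rng.normal(0.0, 0.1, size=(256, 128))
sigma = np.exp(rng.normal(-3.0, 0.5, size=(256, 128)))  # positive std. devs.

# SNR |mu| / sigma: a low value marks weights the posterior itself
# identifies as uninformative (uncertainty dominates the mean).
snr = np.abs(mu) / sigma
threshold = 1.0          # assumed cut-off; in practice tuned against validation accuracy
mask = snr >= threshold  # keep only weights the network is "confident" about

pruned_mu = mu * mask
compression = 1.0 - mask.mean()
print(f"pruned {compression:.1%} of weights at SNR threshold {threshold}")
```

In this toy setup the surviving weights are kept at their posterior means, and the fraction of weights removed gives a simple measure of the achievable compression at a given threshold.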