Model binarization is an effective method for compressing neural networks and accelerating their inference, enabling state-of-the-art models to run on resource-limited devices. However, a significant performance gap still exists between the 1-bit model and its 32-bit counterpart. Empirical studies show that binarization causes severe information loss in both forward and backward propagation, which harms the performance of binary neural networks (BNNs), and that the limited representation ability of binarized parameters is one of the bottlenecks of BNN performance. We present a novel Distribution-sensitive Information Retention Network (DIR-Net) that retains the information of forward activations and backward gradients, improving BNNs through distribution-sensitive optimization without adding overhead at inference time. The DIR-Net mainly relies on two technical contributions: (1) Information Maximized Binarization (IMB): minimizing the information loss and the quantization error of weights/activations simultaneously by balancing and standardizing the weight distribution in the forward propagation; (2) Distribution-sensitive Two-stage Estimator (DTE): minimizing the information loss of gradients through a gradual, distribution-sensitive approximation of the sign function in the backward propagation, jointly considering the updating capability and the accuracy of gradients. The DIR-Net investigates both the forward and backward processes of BNNs from a unified information perspective, thereby providing new insight into the mechanism of network binarization. Comprehensive experiments on the CIFAR-10 and ImageNet datasets show that DIR-Net consistently outperforms state-of-the-art (SOTA) binarization approaches under both mainstream and compact architectures. Additionally, we deploy DIR-Net on real-world resource-limited devices, achieving 11.1× storage savings and a 5.4× speedup.
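To make the two components concrete, below is a minimal PyTorch-style sketch of how IMB and DTE could be realized. It is an illustration under stated assumptions, not the authors' released implementation: the class names `IMBConv2d` and `DTESign`, the tanh-based gradient surrogate, and the sharpness parameter `t` and its schedule are assumptions drawn from the abstract's description of balancing/standardizing weights and gradually approximating the sign function.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class DTESign(torch.autograd.Function):
    """Sign with a two-stage gradient estimator (sketch).

    Backward approximates d/dx sign(x) with the derivative of
    k * tanh(t * x); the sharpness t is assumed to be annealed from
    small (strong updating capability) to large (accurate gradient)
    over training. The exact schedule in the paper may differ.
    """

    @staticmethod
    def forward(ctx, x, t):
        ctx.save_for_backward(x)
        ctx.t = t
        return torch.sign(x)

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        t = ctx.t
        k = max(1.0 / t, 1.0)  # bound the peak of the surrogate gradient
        grad_input = k * t * (1.0 - torch.tanh(t * x) ** 2) * grad_output
        return grad_input, None  # no gradient for the scalar t


class IMBConv2d(nn.Conv2d):
    """1-bit convolution with Information Maximized Binarization (sketch).

    Weights are balanced (zero mean) and standardized (unit std) before
    the sign function so that the binary weights carry maximal
    information entropy. The paper additionally rescales binary weights
    (e.g., by a bit-shift scalar); that step is omitted here for brevity.
    """

    def forward(self, x, t=1.0):
        w = self.weight
        w_std = (w - w.mean()) / (w.std() + 1e-8)  # balance + standardize
        bw = DTESign.apply(w_std, t)               # 1-bit weights
        ba = DTESign.apply(x, t)                   # 1-bit activations
        return F.conv2d(ba, bw, self.bias, self.stride,
                        self.padding, self.dilation, self.groups)
```

During training, `t` would be increased step by step (the "two stages"): early on the soft surrogate keeps gradients flowing through near-zero weights, and later the sharper surrogate closely matches the true sign function, reducing gradient mismatch.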