Adder Neural Network (AdderNet) provides a new way to develop energy-efficient neural networks by replacing the expensive multiplications in convolution with cheaper additions (i.e., the l1-norm). To achieve higher hardware efficiency, it is necessary to further study the low-bit quantization of AdderNet. Because the commutative law that holds for multiplication does not hold for the l1-norm, the well-established quantization methods for convolutional networks cannot be applied to AdderNets. Existing AdderNet quantization techniques therefore propose to use only one shared scale to quantize both the weights and the activations simultaneously. Admittedly, such an approach preserves the commutative law during l1-norm quantization, but the accuracy drop after low-bit quantization cannot be ignored. To this end, we first thoroughly analyze the difference between the distributions of weights and activations in AdderNet and then propose a new quantization algorithm that redistributes the weights and the activations. Specifically, the pre-trained full-precision weights in different kernels are clustered into different groups, so that intra-group shared and inter-group independent scales can be adopted. To further compensate for the accuracy drop caused by the distribution difference, we develop a lossless range clamp scheme for weights and a simple yet effective outlier clamp strategy for activations. Thus, the functionality of the full-precision weights and the representation ability of the full-precision activations can be fully preserved. The effectiveness of the proposed quantization method for AdderNet is well verified on several benchmarks; e.g., our 4-bit post-training quantized adder ResNet-18 achieves 66.5% top-1 accuracy on ImageNet with comparable energy efficiency, which is about 8.5% higher than that of previous AdderNet quantization methods.
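The following is a minimal NumPy sketch of the group-wise scale idea described above: kernels are grouped by their value range, each group shares one scale, and clipping stands in for the clamp schemes. The function names (`group_scales`, `quantize`), the sorted-split grouping (in place of the paper's clustering), and the epsilon floor are illustrative assumptions, not the authors' exact method.

```python
import numpy as np

def group_scales(weights, num_groups=4, num_bits=4):
    """Assign one quantization scale per group of kernels: intra-group
    shared, inter-group independent. `weights` has shape (out_channels, ...).
    Grouping here is a simple sorted split by kernel range, standing in
    for the clustering step described in the abstract."""
    ranges = np.abs(weights.reshape(weights.shape[0], -1)).max(axis=1)
    order = np.argsort(ranges)                      # sort kernels by range
    groups = np.array_split(order, num_groups)      # equal-size groups
    qmax = 2 ** (num_bits - 1) - 1
    scales = np.empty(weights.shape[0], dtype=weights.dtype)
    for g in groups:
        scales[g] = max(ranges[g].max(), 1e-8) / qmax  # shared within group
    return scales

def quantize(x, scale, num_bits=4):
    """Symmetric uniform quantization; np.clip plays the role of the
    outlier/range clamp by saturating values that fall outside the grid."""
    qmax = 2 ** (num_bits - 1) - 1
    return np.clip(np.round(x / scale), -qmax, qmax) * scale

# Usage: per-kernel scales broadcast over a (16, 3, 3, 3) weight tensor.
w = np.random.randn(16, 3, 3, 3).astype(np.float32)
s = group_scales(w, num_groups=4, num_bits=4)
w_q = quantize(w, s[:, None, None, None])
```

Because every kernel in a group shares the same scale, additions between quantized weights and activations inside that group remain consistent, which is the property the single-shared-scale baseline achieves globally at the cost of accuracy.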