Deep neural networks that use low-precision weights and operations at inference time reduce memory cost and accelerator power consumption. The main challenge for quantization algorithms is maintaining accuracy at low bit-widths. We propose learned gradient linear symmetric quantization (LG-LSQ), a method for quantizing weights and activation functions to low bit-widths with high accuracy on integer neural network processors. First, we introduce the scaling simulated gradient (SSG) method, which determines an appropriate gradient for the scaling factor of the linear quantizer during training. Second, we introduce the arctangent soft round (ASR) method, which, unlike the straight-through estimator (STE), keeps the gradient from becoming zero and thereby resolves the discreteness introduced by the rounding operation. Finally, to bridge the gap between full-precision and low-bit quantized networks, we propose the minimize discretization error (MDE) method, which determines an accurate gradient in backpropagation. The combined ASR+MDE method is a simple drop-in alternative to STE and is applicable to different uniform quantization schemes. In our evaluation, the proposed quantizer achieved full-precision baseline accuracy on various 3-bit networks, including ResNet18, ResNet34, and ResNet50, and an accuracy drop of less than 1% when quantizing to 4-bit weights and 4-bit activations in lightweight models such as MobileNetV2 and ShuffleNetV2.
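To make the rounding surrogate concrete, below is a minimal PyTorch sketch of an arctangent-based soft round and a linear symmetric quantizer with a learned scaling factor. This is an illustrative assumption, not the paper's published implementation: the exact ASR formula, the `alpha` sharpness parameter, the scale initialization, and the names `arctan_soft_round` and `LinearSymmetricQuantizer` are all hypothetical; the sketch only shows the general idea of a smooth rounding function whose gradient never vanishes.

```python
import math
import torch


def arctan_soft_round(x: torch.Tensor, alpha: float = 10.0) -> torch.Tensor:
    # Smooth surrogate for round(): pulls x toward the nearest integer while
    # keeping a strictly positive derivative everywhere (a hard round has a
    # zero gradient almost everywhere, which is what STE must work around).
    m = torch.floor(x)
    r = x - m - 0.5                                  # fractional offset in [-0.5, 0.5)
    norm = 2.0 * math.atan(alpha / 2.0)              # maps arctan output back to [-0.5, 0.5]
    return m + 0.5 + torch.arctan(alpha * r) / norm  # -> round(x) as alpha -> infinity


class LinearSymmetricQuantizer(torch.nn.Module):
    # Hypothetical linear symmetric quantizer with a learned scaling factor.
    def __init__(self, bits: int = 3, alpha: float = 10.0):
        super().__init__()
        self.qmax = 2 ** (bits - 1) - 1              # symmetric signed integer range
        self.alpha = alpha
        self.scale = torch.nn.Parameter(torch.tensor(0.1))  # init value is an assumption

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Scale, clip to the integer grid, soft-round, then rescale (fake quantization).
        q = torch.clamp(x / self.scale, -self.qmax, self.qmax)
        return arctan_soft_round(q, self.alpha) * self.scale


# Usage: gradients flow to both the weights and the learned scale.
quant = LinearSymmetricQuantizer(bits=3)
w = torch.randn(16, requires_grad=True)
quant(w).sum().backward()   # w.grad and quant.scale.grad are both populated
```

Because arctan is strictly monotone, the backward pass never sees a zero gradient from the rounding step, which is the property the abstract attributes to ASR; how the paper combines this with the SSG scale gradient and MDE's discretization-error correction is not reproduced here.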