Most existing work on ternary quantization applies projection functions in discrete space, in some cases with scaling factors and thresholds to improve model accuracy. However, the gradients used for optimization are inaccurate, which leaves a notable accuracy gap between full-precision and ternary models. To obtain more accurate gradients, some methods gradually increase the discrete portion of the full-precision weights during forward propagation, e.g., using a temperature-based sigmoid function. Instead of performing ternary quantization directly in discrete space, we push the full-precision weights close to ternary values through a regularization term prior to ternary quantization. In addition, inspired by the temperature-based method, we introduce a re-scaling factor that yields more accurate gradients by simulating the derivative of the sigmoid function. Experimental results show that our method significantly improves the accuracy of ternary quantization on both image classification and object detection tasks.
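
The abstract does not state the regularization term or the re-scaling factor explicitly, so the following is only a minimal sketch under stated assumptions, not the method itself. It assumes a hypothetical regularizer equal to the mean squared distance from each weight to its nearest value in {-alpha, 0, +alpha}, and it shows the derivative of a temperature-scaled sigmoid, which is the quantity a re-scaling factor of this kind would imitate. The names `ternary_regularizer`, `push_towards_ternary`, `alpha`, and `temperature` are illustrative and do not come from the source.

```python
import math


def nearest_ternary(w, alpha):
    """Return the value in {-alpha, 0, +alpha} closest to w."""
    return min((-alpha, 0.0, alpha), key=lambda t: abs(w - t))


def ternary_regularizer(weights, alpha):
    """Assumed regularizer: mean squared distance to the nearest ternary value."""
    return sum((w - nearest_ternary(w, alpha)) ** 2 for w in weights) / len(weights)


def regularizer_grad(weights, alpha):
    """Gradient of the assumed regularizer, holding each nearest target fixed."""
    n = len(weights)
    return [2.0 * (w - nearest_ternary(w, alpha)) / n for w in weights]


def sigmoid_derivative(x, temperature):
    """Derivative of sigmoid(temperature * x) with respect to x."""
    s = 1.0 / (1.0 + math.exp(-temperature * x))
    return temperature * s * (1.0 - s)


def push_towards_ternary(weights, alpha, lr=0.5, steps=50):
    """Plain gradient descent on the regularizer alone (no task loss)."""
    w = list(weights)
    for _ in range(steps):
        grad = regularizer_grad(w, alpha)
        w = [wi - lr * gi for wi, gi in zip(w, grad)]
    return w


if __name__ == "__main__":
    full_precision = [0.9, -0.45, 0.1, -1.2]
    pushed = push_towards_ternary(full_precision, alpha=1.0)
    print("before:", ternary_regularizer(full_precision, 1.0))
    print("after: ", ternary_regularizer(pushed, 1.0))
    print("sigmoid slope at 0, T=4:", sigmoid_derivative(0.0, 4.0))
```

In an actual training loop the regularizer would be added to the task loss with a weighting coefficient, so the weights would settle near ternary values without being forced onto them before the final quantization step. Raising `temperature` sharpens the sigmoid and concentrates its derivative around zero.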