The ever-increasing computational complexity of deep learning models makes their training and deployment difficult on various cloud and edge platforms. Replacing floating-point arithmetic with low-bit integer arithmetic is a promising approach to reduce the energy consumption, memory footprint, and latency of deep learning models. As such, quantization has attracted the attention of researchers in recent years. However, using integer numbers to form a fully functional integer training pipeline, including the forward pass, back-propagation, and stochastic gradient descent, has not been studied in detail. Our empirical and mathematical results reveal that integer arithmetic is sufficient to train deep learning models. Unlike recent proposals, instead of quantization, we directly switch the number representation of the computations. Our novel training method forms a fully integer training pipeline that does not change the trajectory of the loss and accuracy compared to floating-point, nor does it need any special hyper-parameter tuning, distribution adjustment, or gradient clipping. Our experimental results show that our proposed method is effective in a wide variety of tasks such as classification (including vision transformers), object detection, and semantic segmentation.
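To make the idea of an integer-only training step concrete, the sketch below shows one way a forward pass, back-propagation, and an SGD update could be carried out purely in integer (fixed-point) arithmetic for a single linear layer. This is an illustrative assumption, not the paper's actual method: the names `FRAC_BITS`, `SCALE`, `to_fixed`, `fixed_matmul`, and `lr_shift` are hypothetical, and the rescaling-by-shift scheme is only one possible choice of integer representation.

```python
# Minimal sketch (assumed, not the paper's method): integer-only training step
# for one linear layer using fixed-point arithmetic.
import numpy as np

FRAC_BITS = 8                 # hypothetical number of fractional bits
SCALE = 1 << FRAC_BITS        # fixed-point scaling factor (2**8)

def to_fixed(x):
    """Encode a float array as int32 fixed-point (illustration only)."""
    return np.round(x * SCALE).astype(np.int32)

def fixed_matmul(a, b):
    """Integer matmul followed by a right shift to stay in fixed-point range."""
    return (a.astype(np.int64) @ b.astype(np.int64)) >> FRAC_BITS

# Toy data and parameters, all stored as integers.
rng = np.random.default_rng(0)
x = to_fixed(rng.standard_normal((4, 3)))          # inputs
w = to_fixed(rng.standard_normal((3, 2)) * 0.1)    # weights
y_true = to_fixed(rng.standard_normal((4, 2)))     # targets

lr_shift = 6  # hypothetical learning rate of 2**-6, applied as a bit shift

for step in range(3):
    y_pred = fixed_matmul(x, w)        # integer forward pass
    err = y_pred - y_true              # integer error (dL/dy for MSE, up to a constant)
    grad_w = fixed_matmul(x.T, err)    # integer back-propagation through the layer
    w = w - (grad_w >> lr_shift)       # integer SGD update via shift
    loss = int((err.astype(np.int64) ** 2).sum())
    print(f"step {step}: integer loss = {loss}")
```

The design choice here is that every tensor (inputs, weights, errors, gradients) stays in an integer container and rescaling is done with bit shifts, so no floating-point operation appears anywhere in the training loop; how the paper actually handles the number representation is described in the main text, not in this sketch.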