In this work, we propose a low-bit training framework for convolutional neural networks that is built around a novel multi-level scaling (MLS) tensor format. Our framework focuses on reducing the energy consumption of convolution operations by quantizing all convolution operands to a low bit-width format. Specifically, we propose the MLS tensor format, in which the element-wise bit-width can be substantially reduced. We then describe the dynamic quantization and low-bit tensor convolution arithmetic that leverage the MLS tensor format efficiently. Experiments show that our framework achieves a better trade-off between accuracy and bit-width than previous low-bit training frameworks. For training a variety of models on CIFAR-10, a 1-bit mantissa and a 2-bit exponent are adequate to keep the accuracy loss within $1\%$. On larger datasets such as ImageNet, a 4-bit mantissa and a 2-bit exponent are adequate to keep the accuracy loss within $1\%$. Through energy-consumption simulation of the computing units, we estimate that training a variety of models with our framework achieves $8.3\sim10.2\times$ and $1.9\sim2.3\times$ higher energy efficiency than training with full-precision and 8-bit floating-point arithmetic, respectively.
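As a rough illustration of the kind of low-bit mantissa/exponent representation the abstract refers to, the sketch below quantizes a tensor with a single shared power-of-two scale. The abstract does not specify the MLS format itself, so the helper `quantize_low_bit`, its scale rule, and its rounding scheme are hypothetical choices for illustration only, not the authors' method.

```python
import numpy as np

# Illustrative sketch only: a generic "low-bit mantissa + exponent with a
# shared per-tensor scale" quantizer. This is NOT the MLS format, whose
# details are not given in the abstract.

def quantize_low_bit(x, mantissa_bits=4, exponent_bits=2):
    """Quantize a float tensor to a low-bit mantissa/exponent representation
    with one shared power-of-two scale (hypothetical helper)."""
    # Shared scale: align the largest magnitude with the largest exponent.
    max_exp = 2 ** exponent_bits - 1              # e.g. exponents 0..3 for 2 bits
    shared_scale = max(np.max(np.abs(x)) / (2.0 ** max_exp), 1e-12)

    # Per-element exponent: nearest power of two not exceeding the magnitude.
    mag = np.abs(x) / shared_scale
    exp = np.clip(np.floor(np.log2(np.maximum(mag, 1e-12))), 0, max_exp)

    # Per-element mantissa: rounded to `mantissa_bits` fractional bits.
    levels = 2 ** mantissa_bits
    mant = np.round(mag / (2.0 ** exp) * levels) / levels

    # Dequantized value, useful for simulating low-bit training in software.
    return np.sign(x) * mant * (2.0 ** exp) * shared_scale

x = np.random.randn(4, 4).astype(np.float32)
x_q = quantize_low_bit(x, mantissa_bits=4, exponent_bits=2)
print(np.max(np.abs(x - x_q)))  # quantization error shrinks as mantissa_bits grows
```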