This paper tackles the problem of training a deep convolutional neural network with both low-bitwidth weights and activations. Optimizing a low-precision network is very challenging because the quantizer is non-differentiable, which may result in substantial accuracy loss. To address this, we propose three practical approaches to improve network training: (i) progressive quantization, (ii) stochastic precision, and (iii) joint knowledge distillation. First, for progressive quantization, we propose two schemes for progressively finding good local minima. Specifically, we propose to first optimize a network with quantized weights only and subsequently quantize the activations. This is in contrast to traditional methods, which optimize both simultaneously. Furthermore, we propose a second progressive quantization scheme that gradually decreases the bit-width from high precision to low precision during training. Second, to alleviate the excessive training burden caused by the multiple training stages, we further propose a one-stage stochastic precision strategy that randomly samples and quantizes sub-networks while keeping the remaining parts in full precision. Finally, we adopt a novel learning scheme to jointly train a full-precision model alongside the low-precision one. By doing so, the full-precision model provides hints to guide the low-precision model training and significantly improves the performance of the low-precision network. Extensive experiments on various datasets (e.g., CIFAR-100, ImageNet) show the effectiveness of the proposed methods.
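To make the ideas in the abstract concrete, the following is a minimal, hypothetical PyTorch sketch (not the authors' released code) of how two of them could look in practice: a low-bit weight quantizer trained through a straight-through estimator, and a stochastic-precision step that quantizes a randomly sampled subset of layers while leaving the rest in full precision. The class and function names (`QuantizeSTE`, `QuantConv2d`, `sample_stochastic_precision`), the `quant_ratio` parameter, and the DoReFa-style weight mapping are illustrative assumptions rather than details taken from the paper.

```python
# Minimal sketch, assuming a DoReFa-style uniform quantizer and PyTorch; names are illustrative.
import random
import torch
import torch.nn as nn
import torch.nn.functional as F


class QuantizeSTE(torch.autograd.Function):
    """Uniform k-bit quantization of values in [0, 1]; identity gradient (straight-through)."""

    @staticmethod
    def forward(ctx, x, k):
        levels = 2 ** k - 1
        return torch.round(x * levels) / levels

    @staticmethod
    def backward(ctx, grad_output):
        # Straight-through estimator: pass the gradient unchanged past the non-differentiable round.
        return grad_output, None


class QuantConv2d(nn.Conv2d):
    """Conv layer whose weights are quantized to `bits` on the forward pass.

    Setting `quantize = False` leaves the layer in full precision, which is how a
    stochastic-precision step would treat the layers that are not sampled.
    """

    def __init__(self, *args, bits=2, **kwargs):
        super().__init__(*args, **kwargs)
        self.bits = bits
        self.quantize = True

    def forward(self, x):
        w = self.weight
        if self.quantize:
            # Map weights to [0, 1], quantize, then map back to [-1, 1] (a common low-bit scheme).
            t = torch.tanh(w)
            w01 = t / (2 * t.abs().max()) + 0.5
            w = 2 * QuantizeSTE.apply(w01, self.bits) - 1
        return F.conv2d(x, w, self.bias, self.stride, self.padding, self.dilation, self.groups)


def sample_stochastic_precision(model, quant_ratio=0.5):
    """Randomly decide, per iteration, which quantizable layers are quantized."""
    for m in model.modules():
        if isinstance(m, QuantConv2d):
            m.quantize = random.random() < quant_ratio
```

Under these assumptions, the two progressive schemes described in the abstract would amount to retraining such a model in stages: first with only weight quantization enabled and activation quantization added later, or with the `bits` attribute annealed from a higher precision (e.g., 8) down to the target low precision over successive training rounds.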