To deploy convolutional neural networks (CNNs) on a range of resource-constrained targets, it is necessary to compress the CNN models through quantization, whereby full-precision representations are converted to lower-bit representations. To overcome problems such as the sensitivity of the training dataset, high computational requirements, and large time consumption, post-training quantization methods that do not require retraining have been proposed. In addition, to compensate for the accuracy drop without retraining, previous studies on post-training quantization have proposed several complementary methods: calibration, quantization schemes, clipping, granularity, and mixed precision. To generate a quantized model with minimal error, it is necessary to explore all possible combinations of these methods, because they are complementary and CNN models have different characteristics. However, an exhaustive or heuristic search is either too time-consuming or suboptimal. To overcome this challenge, we propose an auto-tuner, Quantune, which builds a gradient tree boosting model to accelerate the search over quantization configurations and reduce the quantization error. We evaluate and compare Quantune with random, grid, and genetic algorithms. The experimental results show that Quantune reduces the search time for quantization by approximately 36.5x with an accuracy loss of 0.07% to 0.65% across six CNN models, including the fragile ones (MobileNet, SqueezeNet, and ShuffleNet). To support multiple targets and adopt continuously evolving quantization works, Quantune is implemented on a full-fledged deep learning compiler as an open-source project.
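To make the surrogate-guided search concrete, the following is a minimal sketch (not Quantune's actual implementation) of using a gradient tree boosting model to steer the exploration of quantization configurations. The option names in SPACE and the evaluate() stub are hypothetical placeholders; the idea is simply to fit a booster on measured configurations and evaluate its most promising prediction next.

```python
# Hypothetical sketch: gradient-boosting surrogate guiding a search over
# quantization configurations (calibration, scheme, clipping, granularity,
# precision). Option names and evaluate() are illustrative, not Quantune's API.
import itertools
import random
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.preprocessing import OrdinalEncoder

SPACE = {
    "calibration": ["min_max", "entropy", "percentile"],
    "scheme":      ["symmetric", "asymmetric"],
    "clipping":    ["none", "aciq"],
    "granularity": ["per_tensor", "per_channel"],
    "precision":   ["int8", "mixed"],
}
configs = [dict(zip(SPACE, vals)) for vals in itertools.product(*SPACE.values())]

def evaluate(cfg):
    """Placeholder: quantize the model with cfg and return measured accuracy."""
    return random.random()  # stand-in for a real accuracy measurement

encoder = OrdinalEncoder()
encoder.fit([[c[k] for k in SPACE] for c in configs])

def encode(cs):
    return encoder.transform([[c[k] for k in SPACE] for c in cs])

# Seed with a few measured configurations, then alternate between refitting
# the booster and evaluating the configuration it predicts to be best.
measured = random.sample(configs, 5)
scores = [evaluate(c) for c in measured]
for _ in range(10):
    model = GradientBoostingRegressor().fit(encode(measured), scores)
    remaining = [c for c in configs if c not in measured]
    best = max(remaining, key=lambda c: model.predict(encode([c]))[0])
    measured.append(best)
    scores.append(evaluate(best))

print("best configuration:", measured[scores.index(max(scores))])
```

Compared with an exhaustive sweep, this style of search only measures a handful of configurations and lets the surrogate rank the rest, which is the mechanism behind the reported reduction in search time.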