Quantizing deep convolutional neural networks for image super-resolution substantially reduces their computational cost. However, existing works either suffer from a severe performance drop at ultra-low precision of 4 bits or lower, or require a heavy fine-tuning process to recover the performance. To our knowledge, this vulnerability to low precision stems from two statistical properties of feature map values. First, the distribution of feature map values varies significantly per channel and per input image. Second, feature maps contain outliers that can dominate the quantization error. Based on these observations, we propose a novel distribution-aware quantization scheme (DAQ) that enables accurate training-free quantization at ultra-low precision. A simple function in DAQ determines the dynamic range of feature maps and weights with low computational burden. Furthermore, our method enables mixed-precision quantization by computing the relative sensitivity of each channel, without any training process involved. Nonetheless, quantization-aware training can also be applied for an additional performance gain. Our new method outperforms recent training-free and even training-based quantization methods when applied to state-of-the-art image super-resolution networks at ultra-low precision.
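To make the idea of a distribution-aware, training-free dynamic range concrete, the sketch below illustrates per-channel, per-input uniform quantization of a feature map. It assumes the dynamic range is derived from simple channel statistics (mean ± k·std), which clips outliers before quantization; the function name `daq_like_quantize`, the choice of statistics, and the parameter `k` are illustrative assumptions, not the paper's exact formulation.

```python
import torch

def daq_like_quantize(x: torch.Tensor, bits: int = 4, k: float = 3.0) -> torch.Tensor:
    """Illustrative channel-wise, distribution-aware uniform quantization.

    x: feature map of shape (N, C, H, W). For each channel of each input image,
    the dynamic range is taken as mean +/- k * std (an assumption for
    illustration), so outliers are clipped instead of stretching the range.
    Returns the dequantized ("fake-quantized") tensor for simulation.
    """
    n, c = x.shape[0], x.shape[1]
    flat = x.reshape(n, c, -1)
    mean = flat.mean(dim=-1, keepdim=True)   # per-image, per-channel mean
    std = flat.std(dim=-1, keepdim=True)     # per-image, per-channel std
    lo = mean - k * std                      # lower clipping bound
    hi = mean + k * std                      # upper clipping bound

    levels = 2 ** bits - 1
    scale = (hi - lo).clamp(min=1e-8) / levels
    q = ((flat.clamp(lo, hi) - lo) / scale).round()  # clip outliers, then quantize
    deq = q * scale + lo                             # map back to the original range
    return deq.reshape_as(x)

# Example: simulate 4-bit quantization of a random feature map.
feat = torch.randn(2, 64, 32, 32)
feat_q = daq_like_quantize(feat, bits=4)
```

Because the range is computed from per-channel statistics of the current input rather than from calibration or fine-tuning, this kind of scheme needs no training data, matching the training-free setting described above.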