Network quantization, which aims to reduce the bit-lengths of network weights and activations, has emerged as a key technique for deploying networks on resource-limited devices. Although recent studies have successfully discretized full-precision networks, they still incur large quantization errors after training, giving rise to a significant performance gap between a full-precision network and its quantized counterpart. In this work, we propose a novel quantization method for neural networks, Cluster-Promoting Quantization (CPQ), which finds the optimal quantization grids while naturally encouraging the underlying full-precision weights to gather cohesively around those grids during training. This property of CPQ stems from two main ingredients that enable differentiable quantization: i) the use of a categorical distribution designed by a specific probabilistic parametrization in the forward pass and ii) our proposed multi-class straight-through estimator (STE) in the backward pass. Since the second component, the multi-class STE, is intrinsically biased, we additionally propose a new bit-drop technique, DropBits, which revises standard dropout regularization to randomly drop bits instead of neurons. As a natural extension of DropBits, we further introduce a way of learning heterogeneous quantization levels, finding a proper bit-length for each layer by imposing an additional regularization on DropBits. We experimentally validate our method on various benchmark datasets and network architectures, and also support a new hypothesis for quantization: learning heterogeneous quantization levels outperforms using the same but fixed quantization levels from scratch.
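To make the two ingredients concrete, below is a minimal PyTorch sketch, under the assumption that the forward pass reduces to a hard assignment of each weight to its nearest learned grid point and that DropBits can be read as randomly zeroing low-order bits of a weight's quantization index. The names `STEQuantize`, `drop_bits`, `levels`, `num_bits`, and `p` are illustrative, not the authors' implementation; the gradient with respect to the grid points is omitted for brevity.

```python
import torch

class STEQuantize(torch.autograd.Function):
    @staticmethod
    def forward(ctx, w, levels):
        # Forward: hard (categorical) assignment of each full-precision
        # weight to its nearest quantization grid point in `levels`.
        idx = (w.unsqueeze(-1) - levels).abs().argmin(dim=-1)
        return levels[idx]

    @staticmethod
    def backward(ctx, grad_out):
        # Backward: straight-through estimator -- pass the gradient through
        # the non-differentiable assignment unchanged (no grad for `levels`).
        return grad_out, None

def drop_bits(idx, num_bits, p=0.1):
    # DropBits-style regularization (assumed reading): independently zero
    # each low-order bit of a weight's quantization index with probability p,
    # randomly coarsening its precision -- dropout over bits, not neurons.
    for b in range(num_bits - 1):
        drop = torch.rand(idx.shape, device=idx.device) < p
        idx = torch.where(drop, idx & ~(1 << b), idx)
    return idx

# Usage: quantize a weight tensor against K learned grid points.
w = torch.randn(4, 4, requires_grad=True)
levels = torch.linspace(-1.0, 1.0, steps=8)  # 3-bit grid, for illustration
w_q = STEQuantize.apply(w, levels)
```

In this sketch the hard assignment is what lets the full-precision weights cluster around the grid points during training, while the straight-through backward pass keeps the whole pipeline trainable by gradient descent.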