We propose a new computationally efficient method for quantizing the weights of pre-trained neural networks that is general enough to handle both multi-layer perceptrons and convolutional neural networks. Our method deterministically quantizes layers in an iterative fashion with no complicated re-training required. Specifically, we quantize each neuron, or hidden unit, using a greedy path-following algorithm. This simple algorithm is equivalent to running a dynamical system, which we prove is stable for quantizing a single-layer neural network (or, alternatively, for quantizing the first layer of a multi-layer network) when the training data are Gaussian. We show that under these assumptions, the quantization error decays with the width of the layer, i.e., its level of over-parametrization. We provide numerical experiments on multi-layer networks to illustrate the performance of our method on MNIST and CIFAR10 data, as well as for quantizing the VGG16 network using ImageNet data.
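The abstract does not spell out the greedy update rule, so the following is only a minimal illustrative sketch, not the paper's exact algorithm: it assumes the greedy step replaces each weight of a neuron, in order, by the alphabet element that best cancels a running residual of pre-activations on the training data. All names (greedy_quantize_neuron, alphabet, the residual u) are hypothetical.

```python
import numpy as np

def greedy_quantize_neuron(w, X, alphabet):
    """Sketch of a greedy path-following quantizer for a single neuron.

    w        : (d,) real-valued weights of the neuron.
    X        : (m, d) data matrix; column X[:, t] multiplies weight w[t]
               over m training samples.
    alphabet : 1-D array of allowed quantized values.

    Each weight is replaced, one at a time, by the alphabet element that keeps
    the running residual u (the accumulated gap between the analog and the
    quantized neuron's pre-activations) as small as possible. Tracking u is
    what makes the procedure a dynamical system.
    """
    d = w.shape[0]
    q = np.zeros(d)
    u = np.zeros(X.shape[0])              # residual of pre-activations so far
    for t in range(d):
        x_t = X[:, t]
        target = u + w[t] * x_t           # what the quantized path should match next
        # project the target onto the current direction and round to the alphabet
        proj = (x_t @ target) / max(x_t @ x_t, 1e-12)
        q[t] = alphabet[np.argmin(np.abs(alphabet - proj))]
        u = target - q[t] * x_t           # update the residual
    return q, u
```

A usage example under the Gaussian-data assumption made in the abstract, again purely illustrative: the relative error of the quantized pre-activations can be monitored as the layer width grows.

```python
rng = np.random.default_rng(0)
m, d = 512, 256
X = rng.standard_normal((m, d))           # Gaussian training data
w = rng.standard_normal(d) / np.sqrt(d)
alphabet = np.linspace(-1.0, 1.0, 5)      # small symmetric alphabet (assumed)
q, u = greedy_quantize_neuron(w, X, alphabet)
rel_err = np.linalg.norm(X @ (w - q)) / np.linalg.norm(X @ w)
```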