While neural networks have been remarkably successful in a wide array of applications, implementing them in resource-constrained hardware remains an area of intense research. Replacing the weights of a neural network with quantized (e.g., 4-bit or binary) counterparts yields massive savings in computation cost, memory, and power consumption. To that end, we generalize a post-training neural-network quantization method, GPFQ, that is based on a greedy path-following mechanism. Among other things, we propose modifications to promote sparsity of the weights, and rigorously analyze the associated error. Additionally, our error analysis expands the results of previous work on GPFQ to handle general quantization alphabets, showing that for quantizing a single-layer network, the relative square error essentially decays linearly in the number of weights -- i.e., the level of over-parametrization. Our result holds across a range of input distributions and for both fully-connected and convolutional architectures, thereby also extending previous results. To empirically evaluate the method, we quantize several common architectures with few bits per weight, and test them on ImageNet, showing only minor loss of accuracy compared to unquantized models. We also demonstrate that standard modifications, such as bias correction and mixed precision quantization, further improve accuracy.
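To make the greedy path-following mechanism concrete, the following is a minimal sketch, not the authors' reference implementation, of how one neuron's weights might be quantized sequentially: each weight is rounded to an alphabet element chosen to keep the accumulated pre-activation error small, and later steps compensate for earlier rounding errors. The function name `gpfq_quantize_neuron`, the 4-bit alphabet construction, and all variable names are illustrative assumptions.

```python
import numpy as np

def gpfq_quantize_neuron(X, w, alphabet):
    """Greedily quantize one neuron's weights in the spirit of GPFQ.

    X        : (m, N) matrix whose t-th column holds m samples of input feature t.
    w        : (N,) vector of pre-trained weights for this neuron.
    alphabet : 1-D array of allowed quantized values (e.g. scaled 4-bit levels).

    Returns a quantized weight vector q. The running residual u tracks the gap
    between the original and quantized pre-activations over the processed weights.
    """
    m, N = X.shape
    q = np.zeros(N)
    u = np.zeros(m)  # accumulated pre-activation error
    for t in range(N):
        x_t = X[:, t]
        norm_sq = np.dot(x_t, x_t)
        if norm_sq == 0:
            continue
        # Project the accumulated error plus the current weight's contribution onto x_t.
        target = np.dot(x_t, u + w[t] * x_t) / norm_sq
        # Round to the nearest alphabet element.
        q[t] = alphabet[np.argmin(np.abs(alphabet - target))]
        # Update the residual so subsequent steps can correct for this rounding error.
        u = u + (w[t] - q[t]) * x_t
    return q

# Usage example with random data and a symmetric 4-bit (15-level) alphabet.
rng = np.random.default_rng(0)
X = rng.standard_normal((512, 256))
w = rng.standard_normal(256) / 16
delta = np.abs(w).max() / 7
alphabet = delta * np.arange(-7, 8)
q = gpfq_quantize_neuron(X, w, alphabet)
rel_err = np.linalg.norm(X @ (w - q)) / np.linalg.norm(X @ w)
print(f"relative pre-activation error: {rel_err:.4f}")
```

In this sketch the relative error is measured on the pre-activations rather than the weights themselves, matching the data-dependent flavor of the analysis described above; details such as the sparsity-promoting modification, general alphabets, and convolutional layers are handled in the paper itself.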