Deep neural networks (DNNs) frequently contain far more weights, represented at a higher precision, than are required for the specific task which they are trained to perform. Consequently, they can often be compressed using techniques such as weight pruning and quantization that reduce both the model size and inference time without appreciable loss in accuracy. However, finding the best compression strategy and corresponding target sparsity for a given DNN, hardware platform, and optimization objective currently requires expensive, frequently manual, trial-and-error experimentation. In this paper, we introduce a programmable system for model compression called Condensa. Users programmatically compose simple operators, in Python, to build more complex and practically interesting compression strategies. Given a strategy and user-provided objective (such as minimization of running time), Condensa uses a novel Bayesian optimization-based algorithm to automatically infer desirable sparsities. Our experiments on four real-world DNNs demonstrate memory footprint and hardware runtime throughput improvements of 188x and 2.59x, respectively, using at most ten samples per search. We have released a reference implementation of Condensa at https://github.com/NVlabs/condensa.
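To make the idea of programmable operator composition concrete, the sketch below shows how simple compression operators (magnitude pruning and half-precision quantization) could be chained into a larger strategy. It is only an illustrative example under assumed names (`prune_magnitude`, `quantize_fp16`, `compose`), not Condensa's actual API; see the released implementation for the real interface.

```python
# Illustrative sketch of composable compression operators (hypothetical names,
# not Condensa's actual API).
import numpy as np

def prune_magnitude(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero out the smallest-magnitude fraction `sparsity` of the weights."""
    k = int(sparsity * weights.size)
    if k == 0:
        return weights.copy()
    threshold = np.partition(np.abs(weights).ravel(), k - 1)[k - 1]
    pruned = weights.copy()
    pruned[np.abs(pruned) <= threshold] = 0.0
    return pruned

def quantize_fp16(weights: np.ndarray) -> np.ndarray:
    """Reduce weight precision to half-precision floating point."""
    return weights.astype(np.float16)

def compose(*operators):
    """Chain simple operators into a more complex compression strategy."""
    def strategy(weights: np.ndarray) -> np.ndarray:
        for op in operators:
            weights = op(weights)
        return weights
    return strategy

# Example strategy: prune 95% of the weights, then quantize the rest to FP16.
strategy = compose(lambda w: prune_magnitude(w, sparsity=0.95), quantize_fp16)
compressed = strategy(np.random.randn(1024, 1024).astype(np.float32))
```

In the actual system, the sparsity passed to the pruning operator is not fixed by hand as above but is inferred automatically by the Bayesian optimization-based search described in the abstract.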