DeepShift: Towards Multiplication-Less Neural Networks

From arXiv (revision notes): added results for 8-bit and 16-bit fixed-point activations, as well as 5-bit, 4-bit, 3-bit, and 2-bit weights; added a link to the GitHub code; updated and fixed the training algorithm; introduced two approaches for the backward and forward passes; showed better results for training from scratch on CIFAR10 and ImageNet; added an implementation on NVIDIA GPUs; accepted at the CVPR Mobile AI 2021 Workshop.

The high computation, memory, and power budgets of convolutional neural network (CNN) inference are major bottlenecks for deploying models to edge computing platforms, e.g., mobile devices and IoT. Moreover, training CNNs is time- and energy-intensive even on high-grade servers. Convolution layers and fully connected layers, because of their intense use of multiplications, are the dominant contributors to this computation budget. We propose to alleviate this problem by introducing two new operations, convolutional shifts and fully-connected shifts, which replace multiplications with bitwise shifts and sign flipping during both training and inference. During inference, both approaches require only 5 bits (or fewer) to represent the weights. This family of neural network architectures (those that use convolutional shifts and fully-connected shifts) is referred to as DeepShift models. We propose two methods to train DeepShift models: DeepShift-Q, which trains regular weights constrained to powers of 2, and DeepShift-PS, which trains the values of the shifts and sign flips directly. The resulting accuracy is very close to, and in some cases higher than, that of the baselines. Converting pre-trained 32-bit floating-point baseline models of ResNet18, ResNet50, VGG16, and GoogleNet to DeepShift and training them for 15 to 30 epochs resulted in Top-1/Top-5 accuracies higher than those of the original models. Last but not least, we implemented GPU kernels for the convolutional shifts and fully-connected shifts and showed a 25% reduction in latency for ResNet18 inference compared to unoptimized multiplication-based GPU kernels. The code can be found at https://github.com/mostafaelhoushi/DeepShift.
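The core idea, multiplying by a power of 2 using only a bitwise shift and a sign flip, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names are hypothetical, integer activations are assumed, and rounding the base-2 logarithm of the weight is an assumed choice for snapping a weight to a power of 2. The abstract does not specify the exact rounding rule or bit encoding.

```python
import math

def quantize_to_power_of_two(w: float) -> tuple[int, int]:
    """Approximate w as sign * 2**shift (hypothetical helper).

    Assumption: the shift is the rounded base-2 logarithm of |w|.
    Returns (sign, shift) with sign in {-1, 0, +1}.
    """
    if w == 0.0:
        return 0, 0
    sign = 1 if w > 0 else -1
    shift = round(math.log2(abs(w)))
    return sign, shift

def shift_multiply(x: int, sign: int, shift: int) -> int:
    """Compute x * sign * 2**shift with a bit shift and a sign flip only."""
    if sign == 0:
        return 0
    # Positive shift -> left shift; negative shift -> arithmetic right shift.
    y = x << shift if shift >= 0 else x >> -shift
    return -y if sign < 0 else y

def shift_dot(xs: list[int], weights: list[float]) -> int:
    """Dot product in which every multiplication is replaced by a shift."""
    total = 0
    for x, w in zip(xs, weights):
        sign, shift = quantize_to_power_of_two(w)
        total += shift_multiply(x, sign, shift)
    return total

# Weights 0.3, -3.7, 1.0 are snapped to 2**-2, -(2**2), 2**0.
result = shift_dot([64, 5, 7], [0.3, -3.7, 1.0])
```

In this sketch the quantization happens once per weight, which loosely corresponds to the DeepShift-Q view of training regular weights constrained to powers of 2. Storing and learning `sign` and `shift` directly would be closer in spirit to DeepShift-PS.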

