Despite the increasing prevalence of deep neural networks, their applicability on resource-constrained devices is limited due to their computational load. While modern devices exhibit a high level of parallelism, real-time latency is still highly dependent on network depth. Although recent works show that below a certain depth, the width of shallower networks must grow exponentially, we presume that neural networks typically exceed this minimal depth to accelerate convergence and incrementally increase accuracy. This motivates us to transform pre-trained deep networks that already exploit such advantages into shallower forms. We propose a method that learns whether non-linear activations can be removed, allowing consecutive linear layers to be folded into one. We apply our method to networks pre-trained on CIFAR-10 and CIFAR-100 and find that they can all be transformed into shallower forms that share a similar depth. Finally, we use our method to provide more efficient alternatives to MobileNetV2 and EfficientNet-Lite architectures on the ImageNet classification task.
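To make the folding idea concrete, the following is a minimal PyTorch sketch, not the authors' implementation: a PReLU-style activation whose learned slope can be driven toward the identity, and a helper that collapses two consecutive fully-connected layers into one once the activation between them has been removed. The class and function names, the clamped-slope parameterization, and the linearity penalty are illustrative assumptions.

```python
import torch
import torch.nn as nn


class FoldableActivation(nn.Module):
    """PReLU-style activation f(x) = max(x, slope * x) with a learned slope in [0, 1].
    As the slope approaches 1 the activation becomes the identity, signalling that
    the surrounding linear layers may be folded. (Illustrative parameterization.)"""

    def __init__(self):
        super().__init__()
        self.slope = nn.Parameter(torch.tensor(0.0))  # 0 -> ReLU-like, 1 -> identity

    def forward(self, x):
        slope = self.slope.clamp(0.0, 1.0)
        return torch.maximum(x, slope * x)

    def linearity_penalty(self):
        # Hypothetical regularizer added to the task loss to push the
        # activation toward the identity (slope -> 1).
        return 1.0 - self.slope.clamp(0.0, 1.0)


def fold_linear_pair(fc1: nn.Linear, fc2: nn.Linear) -> nn.Linear:
    """Collapse fc2(fc1(x)) into a single linear layer. This is only exact when
    the activation between fc1 and fc2 has become the identity."""
    folded = nn.Linear(fc1.in_features, fc2.out_features)
    with torch.no_grad():
        # y = W2 (W1 x + b1) + b2 = (W2 W1) x + (W2 b1 + b2)
        folded.weight.copy_(fc2.weight @ fc1.weight)
        b1 = fc1.bias if fc1.bias is not None else torch.zeros(fc1.out_features)
        b2 = fc2.bias if fc2.bias is not None else torch.zeros(fc2.out_features)
        folded.bias.copy_(fc2.weight @ b1 + b2)
    return folded


# Sanity check: once the intermediate activation is the identity, folding is exact.
fc1, fc2 = nn.Linear(8, 16), nn.Linear(16, 4)
x = torch.randn(2, 8)
assert torch.allclose(fold_linear_pair(fc1, fc2)(x), fc2(fc1(x)), atol=1e-5)
```

The same algebra applies to consecutive convolutions or to a convolution followed by batch normalization; only the composition of the weight tensors changes.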