In the era of Deep Neural Network based solutions for a variety of real-life tasks, having a compact and energy-efficient deployable model has become fairly important. Most existing deep architectures use the Rectified Linear Unit (ReLU) activation. In this paper, we propose the novel idea of rotating the ReLU activation to give the architecture one more degree of freedom. We show that this activation, in which the rotation is learned via training, eliminates those parameters/filters in the network that are not important for the task; in other words, the rotated ReLU appears to perform implicit sparsification. The slopes of the rotated ReLU activations act as coarse feature extractors, and unnecessary features can be eliminated before retraining. Our studies indicate that features consistently choose to pass through a smaller number of filters in architectures such as ResNet and its variants. Hence, by rotating the ReLU, the weights or filters that are not necessary are automatically identified and can be dropped, giving rise to significant savings in memory and computation. Furthermore, in some cases we observe that, along with the savings in memory and computation, we also improve upon the reported performance of the corresponding baseline work on popular datasets such as MNIST, CIFAR-10, CIFAR-100, and SVHN.
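To make the idea of a learnable rotation concrete, the sketch below shows one possible realization, assuming the rotation is parameterized as a learnable per-channel slope applied to the standard ReLU output; the class name `RotatedReLU` and this particular parameterization are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn as nn

class RotatedReLU(nn.Module):
    """Illustrative sketch: ReLU with a learnable per-channel slope."""

    def __init__(self, num_channels: int):
        super().__init__()
        # One learnable slope per channel; initialized to 1 so the
        # activation starts out as the ordinary ReLU.
        self.slope = nn.Parameter(torch.ones(num_channels))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Expects (N, C, H, W) inputs. Scale the ReLU output by the
        # learned slope; channels whose slope is driven towards zero
        # during training contribute nothing downstream, so the
        # corresponding filters can be identified and pruned.
        return self.slope.view(1, -1, 1, 1) * torch.relu(x)
```

Under this assumed parameterization, filters whose learned slopes collapse towards zero can be dropped after training, which is the implicit sparsification the abstract refers to.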