Pruning Deep Neural Networks (DNNs) is a prominent field of study aimed at accelerating inference runtime. In this paper, we introduce RED++, a novel data-free pruning protocol. Requiring only a trained neural network and making no assumptions on the DNN architecture, RED++ exploits an adaptive, data-free scalar hashing that exposes redundancies among neuron weight values. We study the theoretical and empirical guarantees on accuracy preservation under this hashing, as well as the pruning ratio expected from exploiting these redundancies. We further propose a novel data-free pruning technique for DNN layers that removes input-wise redundant operations. The algorithm is straightforward and parallelizable, and it offers a novel perspective on DNN pruning by shifting the burden from large computations to efficient memory access and allocation. We provide theoretical guarantees on the performance of RED++ and empirically demonstrate its superiority over other data-free pruning methods, as well as its competitiveness with data-driven ones, on ResNets, MobileNets, and EfficientNets.
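To make the mechanism concrete, below is a minimal, self-contained sketch of the two-step idea: hash the scalar weights of a layer onto a small set of shared values, then merge output neurons whose hashed incoming weights coincide (identical rows compute identical activations, so their outgoing weights can be summed into one). The hashing rule used here, a plain 1-D k-means with `n_hash_values` shared values, is an illustrative stand-in, not the paper's adaptive hashing scheme, and the toy layer sizes are chosen only so that duplicates provably appear; biases are omitted for brevity, and the input-wise operation removal of RED++ is not shown.

```python
import numpy as np

def hash_weights(w, n_hash_values=2, n_iter=50):
    """Snap every scalar in `w` to the nearest of `n_hash_values` shared values
    (1-D k-means / Lloyd's algorithm; a stand-in for the adaptive hashing)."""
    flat = w.ravel()
    # Initialize the shared values on evenly spaced quantiles of the weights.
    centers = np.quantile(flat, np.linspace(0.0, 1.0, n_hash_values))
    for _ in range(n_iter):
        assign = np.abs(flat[:, None] - centers[None, :]).argmin(axis=1)
        for c in range(n_hash_values):
            mask = assign == c
            if mask.any():
                centers[c] = flat[mask].mean()
    assign = np.abs(flat[:, None] - centers[None, :]).argmin(axis=1)
    return centers[assign].reshape(w.shape)

def merge_redundant_neurons(w_in, w_out):
    """Fuse output neurons whose hashed incoming weights are identical.

    w_in  : (n_out, n_in) hashed incoming weights of a layer.
    w_out : (n_next, n_out) weights of the following layer.
    Duplicate rows of `w_in` yield identical activations, so each duplicate
    neuron is dropped and its outgoing weights are added to its twin's.
    """
    _, first_idx, inverse = np.unique(
        w_in, axis=0, return_index=True, return_inverse=True)
    inverse = inverse.ravel()
    keep = np.sort(first_idx)  # original indices of the kept representatives
    w_out_merged = np.zeros((w_out.shape[0], keep.size))
    for old in range(w_in.shape[0]):
        rep = first_idx[inverse[old]]  # representative of this neuron's row
        w_out_merged[:, np.searchsorted(keep, rep)] += w_out[:, old]
    return w_in[keep], w_out_merged

rng = np.random.default_rng(0)
# Toy sizes chosen so hashing provably creates duplicates: 64 neurons with
# 4 inputs over 2 shared values allow at most 2**4 = 16 distinct rows.
w1 = hash_weights(rng.normal(size=(64, 4)))
w2 = rng.normal(size=(10, 64))
w1_pruned, w2_merged = merge_redundant_neurons(w1, w2)
print(f"kept {w1_pruned.shape[0]} of {w1.shape[0]} neurons")

# The pruning is exact: a ReLU forward pass gives bitwise-equal predictions.
x = rng.normal(size=(5, 4))
dense = np.maximum(x @ w1.T, 0.0) @ w2.T
pruned = np.maximum(x @ w1_pruned.T, 0.0) @ w2_merged.T
assert np.allclose(dense, pruned)
```

Note the design point this illustrates: once weights are hashed, redundancy detection reduces to grouping identical rows, which costs memory lookups and index bookkeeping rather than any retraining or data-dependent computation.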