We present the Amenable Sparse Network Investigator (ASNI), an algorithm that employs a novel pruning strategy based on a sigmoid function to induce the global sparsity level over the course of a single round of training. ASNI fulfills both pruning tasks, whereas current state-of-the-art strategies can accomplish only one of them. ASNI consists of two subalgorithms: 1) ASNI-I and 2) ASNI-II. ASNI-I learns an accurate sparse off-the-shelf network in a single round of training. ASNI-II learns both a sparse network and an initialization that is quantized, compressed, and from which the sparse network is trainable. The learned initialization is quantized because only two numbers are learned to initialize the nonzero parameters of each layer; thus, for a network with L layers, the initialization has 2L quantization levels in total. The learned initialization is also compressed because it is a set consisting of only 2L numbers. We call a sparse network that can be trained from such a quantized and compressed initialization amenable. To the best of our knowledge, no other algorithm can learn a quantized and compressed initialization from which the network remains trainable while also solving both pruning tasks. Our numerical experiments show that there exists a quantized and compressed initialization from which the learned sparse network can be trained to an accuracy on a par with its dense counterpart. We also show experimentally that these 2L quantization levels are concentration points of the parameters in each layer of the sparse network learned by ASNI-I. To corroborate the above, we performed a series of experiments with ResNet, VGG-style, small convolutional, and fully connected networks on the ImageNet, CIFAR10, and MNIST datasets.
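The core idea of a sigmoid-driven global sparsity schedule can be sketched as follows. This is a minimal illustration, not the paper's exact parameterization: the function names, the choice of centering the sigmoid at the midpoint of training, and the `steepness` parameter are all assumptions made here for concreteness; global magnitude pruning stands in for whatever criterion ASNI actually uses.

```python
import math

def sparsity_schedule(step, total_steps, final_sparsity=0.9, steepness=10.0):
    """Hypothetical sigmoid schedule: the target sparsity rises smoothly
    from near 0 to `final_sparsity` over one round of training.
    (Assumed form; ASNI's exact schedule may differ.)"""
    # Center the sigmoid at the midpoint of training.
    x = steepness * (step / total_steps - 0.5)
    return final_sparsity * (1.0 / (1.0 + math.exp(-x)))

def prune_mask(weights, sparsity):
    """Global magnitude pruning: zero out the smallest fraction
    `sparsity` of weights across the whole network at once."""
    k = int(len(weights) * sparsity)
    if k == 0:
        return [1] * len(weights)
    threshold = sorted(abs(w) for w in weights)[k - 1]
    return [0 if abs(w) <= threshold else 1 for w in weights]
```

Because the schedule is applied globally rather than per layer, the surviving parameters are selected by comparing magnitudes across the entire network, which matches the abstract's description of inducing the sparsity level globally.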
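The abstract states that only two numbers per layer initialize the nonzero parameters. One plausible reading, which is an assumption made here for illustration and not the paper's stated rule, is that each nonzero weight is set to one of two learned scalars chosen by its sign:

```python
def quantized_init(signs, pos_val, neg_val):
    """Hypothetical ASNI-II-style initialization: every nonzero
    parameter in a layer takes one of just two learned values,
    so an L-layer network needs only 2L numbers in total.
    (The sign-based assignment is an assumption.)"""
    return [pos_val if s > 0 else neg_val for s in signs]

# One layer's nonzero parameters, initialized from two scalars.
layer_init = quantized_init([1, -1, 1, -1], 0.05, -0.08)
```

This also makes the compression claim concrete: storing the initialization of the whole network requires only the 2L scalars plus the sparsity mask, rather than one number per parameter.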