Network pruning is an effective approach to reducing network complexity without compromising performance. Existing studies achieve sparsity in neural networks via time-consuming weight tuning or complex searches over networks with expanded width, which greatly limits the applications of network pruning. In this paper, we show that high-performing sparse sub-networks that require no weight tuning, termed "lottery jackpots", exist in pre-trained models with unexpanded width. For example, we obtain a lottery jackpot that has only 10% of the parameters and still matches the performance of the original dense VGGNet-19 without any modification of the pre-trained weights. Furthermore, we observe that the sparse masks derived from many existing pruning criteria overlap heavily with the searched mask of our lottery jackpot, and among these criteria, magnitude-based pruning yields the mask most similar to ours. Based on this insight, we initialize our sparse mask using magnitude pruning, reducing the cost of the lottery jackpot search by at least 3x while achieving comparable or even better performance. Specifically, our magnitude-based lottery jackpot removes 90% of the weights in ResNet-50 yet easily reaches more than 70% top-1 accuracy on ImageNet using only 10 search epochs.
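To make the magnitude-based mask initialization concrete, the following is a minimal, hypothetical sketch (assuming PyTorch; the function name and shapes are illustrative and not from the paper). It shows how a binary mask keeping the largest-magnitude 10% of a layer's pre-trained weights could be built, leaving the weights themselves untouched; the paper's subsequent mask search is not shown here.

```python
import torch

def magnitude_mask(weight: torch.Tensor, sparsity: float) -> torch.Tensor:
    """Return a binary mask keeping the largest-magnitude fraction of weights.

    For example, sparsity=0.9 keeps the top 10% of |weight| entries and zeroes the rest.
    (Illustrative helper, not the authors' implementation.)
    """
    num_keep = max(1, int(weight.numel() * (1.0 - sparsity)))
    # Threshold at the smallest magnitude among the kept entries.
    threshold = torch.topk(weight.abs().flatten(), num_keep).values.min()
    return (weight.abs() >= threshold).float()

# Example: initialize a 90%-sparse mask for one layer of a pre-trained model.
w = torch.randn(512, 256)            # stand-in for a pre-trained weight matrix
mask = magnitude_mask(w, sparsity=0.9)
sparse_w = w * mask                   # pre-trained weights are unmodified; the mask selects them
```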