Transfer learning is a classic paradigm by which models pretrained on large "upstream" datasets are adapted to yield good results on "downstream," specialized datasets. Generally, it is understood that models that are more accurate on the "upstream" dataset will provide better transfer accuracy "downstream." In this work, we perform an in-depth investigation of this phenomenon in the context of convolutional neural networks (CNNs) trained on the ImageNet dataset which have been pruned, that is, compressed by sparsifying their connections. Specifically, we consider transfer using unstructured pruned models obtained by applying several state-of-the-art pruning methods, including magnitude-based, second-order, re-growth, and regularization approaches, in the context of twelve standard transfer tasks. In a nutshell, our study shows that sparse models can match or even outperform the transfer performance of dense models, even at high sparsities, and, while doing so, can lead to significant inference and even training speedups. At the same time, we observe and analyze significant differences in the behaviour of different pruning methods.
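To make the setup concrete, the following is a minimal sketch, not the paper's implementation, of the kind of pipeline the abstract describes: an ImageNet-pretrained CNN is sparsified by unstructured magnitude pruning and its classification head is replaced for a downstream task. It assumes PyTorch and torchvision; the one-shot pruning call, the 90% sparsity level, and the 12-class head are illustrative choices only.

```python
import torch.nn as nn
import torch.nn.utils.prune as prune
import torchvision.models as models

# ImageNet-pretrained backbone (dense "upstream" model).
model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)

# One-shot global magnitude pruning of all conv layers to 90% sparsity.
# The paper's methods (gradual magnitude, second-order, re-growth,
# regularization) are more involved; this variant is only illustrative.
conv_params = [(m, "weight") for m in model.modules() if isinstance(m, nn.Conv2d)]
prune.global_unstructured(conv_params, pruning_method=prune.L1Unstructured, amount=0.9)

# Bake the sparsity into the weights (drop the re-parametrization hooks).
for module, name in conv_params:
    prune.remove(module, name)

# Replace the head for a hypothetical 12-class downstream task, then fine-tune.
model.fc = nn.Linear(model.fc.in_features, 12)
```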