Although multi-task deep neural network (DNN) models offer computation and storage benefits over individual single-task DNN models, they can be further optimized via model compression. Numerous structured pruning methods have been developed that readily achieve speedups in single-task models, but the pruning of multi-task networks has not yet been extensively studied. In this work, we investigate the effectiveness of structured pruning on multi-task models. We use an existing single-task filter pruning criterion and also introduce an MTL-based filter pruning criterion to estimate filter importance scores. We prune the model with an iterative pruning strategy under both criteria. We show that, with careful hyper-parameter tuning, architectures obtained from the different pruning criteria do not differ significantly in their performance across tasks when the number of parameters is similar. We also show that iterative structured pruning may not be the best way to obtain a well-performing pruned model because, at extreme pruning levels, performance drops sharply across all tasks; however, when the same pruned architectures are randomly re-initialized and re-trained from scratch, they achieve better results.
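To illustrate the general setup the abstract describes, the sketch below shows iterative structured (filter) pruning of a shared multi-task backbone using a simple L1-norm importance criterion. This is a minimal, assumed example in PyTorch, not the paper's implementation: the toy backbone, the two task heads, the 10% per-step prune ratio, and the L1 criterion (a common single-task choice; the paper's MTL-based criterion differs) are all illustrative assumptions.

```python
# Minimal sketch (assumed, not the paper's code) of iterative structured
# filter pruning with an L1-norm importance criterion on a shared backbone.
import torch
import torch.nn as nn

def filter_importance(conv: nn.Conv2d) -> torch.Tensor:
    # Importance score per output filter: L1 norm of its weights.
    return conv.weight.detach().abs().sum(dim=(1, 2, 3))

def prune_filters(conv: nn.Conv2d, prune_ratio: float) -> None:
    # Zero out the least important filters in place (structured pruning).
    scores = filter_importance(conv)
    n_prune = int(prune_ratio * scores.numel())
    if n_prune == 0:
        return
    idx = torch.argsort(scores)[:n_prune]
    with torch.no_grad():
        conv.weight[idx] = 0.0
        if conv.bias is not None:
            conv.bias[idx] = 0.0

# Hypothetical shared backbone with two task-specific heads.
backbone = nn.Sequential(
    nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
)
heads = nn.ModuleDict({
    "seg": nn.Conv2d(64, 10, 1),    # e.g. segmentation head
    "depth": nn.Conv2d(64, 1, 1),   # e.g. depth-estimation head
})

# Iterative pruning loop: prune a small fraction of filters, then fine-tune.
for step in range(5):
    for layer in backbone:
        if isinstance(layer, nn.Conv2d):
            prune_filters(layer, prune_ratio=0.1)
    # fine_tune(backbone, heads)  # multi-task fine-tuning would happen here
```

In practice, each pruning step would be followed by multi-task fine-tuning (the commented `fine_tune` call is a placeholder), and the re-training comparison in the abstract corresponds to re-initializing the pruned architecture and training it from scratch instead of fine-tuning.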