Deep learning approaches have achieved unprecedented performance on visual recognition tasks such as object detection and pose estimation. However, state-of-the-art models contain millions of floating-point parameters, which makes them computationally expensive and constrains their deployment on hardware such as mobile phones and IoT nodes. Moreover, the activations of deep neural networks tend to be sparse, suggesting that these models are over-parameterized and contain redundant neurons. Model compression techniques, such as pruning and quantization, have recently shown promising results, reducing model complexity with little loss in performance. In this work, we extend pruning, a compression technique that discards unnecessary model connections, together with weight-sharing techniques, to the task of object detection. With our approach, we are able to compress a state-of-the-art object detection model by 30.0% without a loss in performance. We also show that our compressed model can easily be initialized with existing pre-trained weights, and is thus able to fully utilize published state-of-the-art model zoos.
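The two compression ideas named above can be illustrated with a minimal sketch: magnitude pruning zeroes out the smallest-magnitude fraction of a weight tensor, and weight sharing replaces the surviving weights with entries from a small shared codebook. The function names, the naive k-means codebook construction, and all parameter choices below are illustrative assumptions, not the paper's actual method.

```python
import numpy as np

def magnitude_prune(weights, sparsity=0.3):
    """Zero out the smallest-magnitude `sparsity` fraction of weights.
    Illustrative sketch; real pruning schemes are often layer-wise and iterative."""
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)
    if k == 0:
        return weights.copy()
    # k-th smallest absolute value serves as the pruning threshold
    threshold = np.partition(flat, k - 1)[k - 1]
    pruned = weights.copy()
    pruned[np.abs(pruned) <= threshold] = 0.0
    return pruned

def share_weights(weights, n_clusters=4, n_iter=20):
    """Make the surviving (nonzero) weights share a small codebook of
    `n_clusters` values, built here with a naive 1-D k-means (assumption)."""
    w = weights.copy()
    nonzero = w[w != 0]
    if nonzero.size == 0:
        return w
    # initialize centroids evenly over the weight range
    centroids = np.linspace(nonzero.min(), nonzero.max(), n_clusters)
    for _ in range(n_iter):
        assign = np.argmin(np.abs(nonzero[:, None] - centroids[None, :]), axis=1)
        for c in range(n_clusters):
            members = nonzero[assign == c]
            if members.size:
                centroids[c] = members.mean()
    assign = np.argmin(np.abs(nonzero[:, None] - centroids[None, :]), axis=1)
    # each remaining connection now stores only a codebook index in practice
    w[w != 0] = centroids[assign]
    return w
```

After pruning, a layer stores a sparse mask plus, with weight sharing, only small integer indices into the codebook, which is where the storage savings come from.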