In recent years, large-scale deep models have achieved great success, but their huge computational complexity and massive storage requirements make it challenging to deploy them on resource-limited devices. As a model compression and acceleration method, knowledge distillation effectively improves the performance of small models by transferring the dark knowledge from the teacher detector. However, most existing distillation-based detection methods mainly imitate features near bounding boxes, which suffers from two limitations. First, they ignore the beneficial features outside the bounding boxes. Second, these methods imitate some features that are mistakenly regarded as background by the teacher detector. To address these issues, we propose a novel Feature-Richness Score (FRS) method to choose the important features that improve generalized detectability during distillation. The proposed method effectively retrieves the important features outside the bounding boxes and removes the detrimental features within the bounding boxes. Extensive experiments show that our method achieves excellent performance on both anchor-based and anchor-free detectors. For example, RetinaNet with ResNet-50 achieves 39.7% mAP on the COCO2017 dataset, which even surpasses the ResNet-101-based teacher detector (38.9%) by 0.8%.
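For concreteness, below is a minimal PyTorch-style sketch of one plausible reading of an FRS-weighted feature-imitation loss. The abstract does not specify how the score is computed, so the definition used here (the teacher's per-location maximum class probability), the function name frs_distill_loss, and all tensor shapes are assumptions for illustration only, not the paper's exact formulation.

import torch

def frs_distill_loss(student_feat, teacher_feat, teacher_cls_logits):
    # student_feat:       (N, C, H, W) student FPN feature, passed through a
    #                     1x1 adapter so its channel count matches the teacher's.
    # teacher_feat:       (N, C, H, W) teacher FPN feature at the same level.
    # teacher_cls_logits: (N, K, H, W) teacher per-location classification
    #                     logits (K classes; anchors folded into locations).
    with torch.no_grad():
        # Assumed richness score: the teacher's confidence that *any* object
        # is present at a location, independent of ground-truth boxes.
        probs = teacher_cls_logits.sigmoid()                 # (N, K, H, W)
        richness = probs.max(dim=1, keepdim=True).values     # (N, 1, H, W)

    # Weighted feature imitation: "rich" locations contribute more, whether
    # or not they fall inside a bounding box; low-score regions (likely
    # background, even inside boxes) are suppressed.
    sq_err = (student_feat - teacher_feat) ** 2              # (N, C, H, W)
    return (richness * sq_err).sum() / richness.sum().clamp(min=1e-6)

In such a setup, a training loop would sum this loss over FPN levels and add it, with a weighting coefficient, to the detector's standard detection loss.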