Previous knowledge distillation (KD) methods for object detection mostly focus on feature imitation rather than on mimicking the prediction logits, because logit mimicking is inefficient at distilling localization information. In this paper, we investigate whether logit mimicking always lags behind feature imitation. Towards this goal, we first present a novel localization distillation (LD) method that efficiently transfers localization knowledge from the teacher to the student. Second, we introduce the concept of a valuable localization region, which helps to selectively distill classification and localization knowledge for a given region. Combining these two new components, we show for the first time that logit mimicking can outperform feature imitation, and that the absence of localization distillation is a critical reason why logit mimicking has underperformed for years. Thorough studies reveal the great potential of logit mimicking: it can significantly alleviate localization ambiguity, learn robust feature representations, and ease the training difficulty in the early stage. We also establish a theoretical connection between the proposed LD and classification KD, showing that they share an equivalent optimization effect. Our distillation scheme is both simple and effective, and it can be easily applied to dense horizontal object detectors as well as rotated object detectors. Extensive experiments on the MS COCO, PASCAL VOC, and DOTA benchmarks demonstrate that our method achieves considerable AP improvement without sacrificing inference speed. Our source code and pretrained models are publicly available at https://github.com/HikariTJU/LD.
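The abstract does not spell out the form of the LD loss. As a minimal sketch only, suppose each box edge is predicted as logits over a set of discretized bins, so that localization can be distilled with the same temperature-softened KL divergence used in classification KD. The function names, tensor shapes, bin count, and temperature below are illustrative assumptions, not the released implementation.

```python
import numpy as np


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over the last axis (numerically stable)."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def ld_loss(student_logits, teacher_logits, temperature=10.0):
    """Hypothetical localization distillation loss.

    Assumed shape of both inputs: (num_boxes, 4, num_bins), i.e. one
    discrete distribution per box edge. Returns the mean KL divergence
    KL(teacher || student) over all edges, scaled by temperature**2 as
    is conventional in classification KD.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    eps = 1e-12  # guards against log(0)
    kl = np.sum(
        p_teacher * (np.log(p_teacher + eps) - np.log(p_student + eps)),
        axis=-1,
    )
    return float(temperature ** 2 * kl.mean())


# Usage with random stand-in logits (8 boxes, 4 edges, 17 bins: all assumed).
rng = np.random.default_rng(0)
teacher = rng.normal(size=(8, 4, 17))
student = rng.normal(size=(8, 4, 17))
loss = ld_loss(student, teacher)
```

Under these assumptions the loss is zero when the student reproduces the teacher's edge distributions exactly and positive otherwise, which mirrors how classification KD treats class logits.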