Existing solutions for object detection distillation rely on the availability of both a teacher model and ground-truth labels. We propose a new perspective that relaxes this constraint. In our framework, a student is first trained with pseudo labels generated by the teacher, and then fine-tuned on labeled data, if any is available. Extensive experiments demonstrate improvements over existing object detection distillation algorithms. In addition, decoupling teacher distillation from ground-truth supervision in this framework provides interesting properties, such as: 1) using unlabeled data to further improve the student's performance; 2) combining multiple teacher models of different architectures, even with different object categories; and 3) reducing the need for labeled data (with only 20% of the COCO labels, this method matches the performance of a model trained on the full label set). Furthermore, a by-product of this approach is its potential use for domain adaptation. We verify these properties through extensive experiments.
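To make the two-stage pipeline concrete, here is a minimal sketch of the decoupled training loop described above. It uses off-the-shelf torchvision detectors as stand-ins; the specific teacher/student pairing, the confidence threshold, and the `unlabeled_loader`/`labeled_loader` data loaders are illustrative assumptions, not the paper's exact setup.

```python
# Sketch of decoupled distillation: Stage 1 trains the student purely on
# teacher pseudo labels (no ground truth); Stage 2 fine-tunes on labels.
import torch
import torchvision

# Assumed teacher/student pair: a large detector distilled into a smaller one.
teacher = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT").eval()
student = torchvision.models.detection.fasterrcnn_mobilenet_v3_large_fpn(num_classes=91)
optimizer = torch.optim.SGD(student.parameters(), lr=0.005, momentum=0.9)

@torch.no_grad()
def pseudo_label(images, score_thresh=0.5):
    """Run the teacher on (possibly unlabeled) images and keep its
    confident detections as pseudo ground truth for the student."""
    targets = []
    for pred in teacher(images):  # eval-mode detectors return boxes/labels/scores
        keep = pred["scores"] >= score_thresh  # threshold is an assumption
        targets.append({"boxes": pred["boxes"][keep],
                        "labels": pred["labels"][keep]})
    return targets

def train_step(images, targets):
    """One optimization step; train-mode detectors return a loss dict."""
    student.train()
    losses = student(images, targets)
    loss = sum(losses.values())
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Stage 1: distill from the teacher alone -- no ground-truth labels needed.
# (unlabeled_loader is a hypothetical loader yielding lists of image tensors)
# for images in unlabeled_loader:
#     train_step(images, pseudo_label(images))

# Stage 2: fine-tune on whatever labeled data is available, if any.
# for images, gt_targets in labeled_loader:
#     train_step(images, gt_targets)
```

Because the two stages share nothing but the student's weights, Stage 1 can draw on unlabeled images, aggregate pseudo labels from several teachers, or use a teacher trained on a different domain, which is what enables properties 1) through 3) and the domain-adaptation by-product.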