Effectively structuring deep knowledge plays a pivotal role in transferring it from teacher to student, especially in semantic vision tasks. In this paper, we present a simple knowledge structure that exploits and encodes information inside the detection system to facilitate detector knowledge distillation. Specifically, to address the feature imbalance problem while also recovering the relations among semantic instances that would otherwise be missed, we design a graph whose nodes correspond to instance proposal-level features and whose edges represent the relations between nodes. To further refine this graph, we design an adaptive background loss weight to reduce node noise, and background sample mining to prune trivial edges. We transfer the entire graph as an encoded knowledge representation from teacher to student, capturing local and global information simultaneously. We achieve new state-of-the-art results on the challenging COCO object detection task with diverse student-teacher pairs on both one- and two-stage detectors. We also experiment with instance segmentation to demonstrate the robustness of our method. Notably, distilled Faster R-CNN with ResNet18-FPN and ResNet50-FPN yields 38.68 and 41.82 box AP, respectively, on the COCO benchmark, and distilled Faster R-CNN with ResNet101-FPN reaches 43.38 AP, outperforming the ResNet152-FPN teacher by about 0.7 AP. Code: https://github.com/dvlab-research/Dsig.
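To make the graph idea concrete, the following is a minimal sketch of a graph-based distillation loss. It is not taken from the released code. It assumes that each node is a plain feature vector for one proposal, that edges are cosine similarities between nodes, that both the node and edge terms are mean squared errors, and that teacher and student features share the same dimension. The names `graph_distill_loss`, `build_edges`, `is_foreground`, and `bg_weight` are hypothetical, and the fixed `bg_weight` only stands in for the adaptive background loss weight described above. Background sample mining is not modeled.

```python
import math


def cosine(u, v):
    """Cosine similarity between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u > 0 and norm_v > 0 else 0.0


def build_edges(nodes):
    """Dense edge matrix: entry (i, j) is the similarity between nodes i and j.

    Assumption: the abstract does not specify the edge definition;
    cosine similarity is used here purely for illustration.
    """
    n = len(nodes)
    return [[cosine(nodes[i], nodes[j]) for j in range(n)] for i in range(n)]


def graph_distill_loss(teacher_nodes, student_nodes, is_foreground, bg_weight=0.5):
    """Node term (local information) plus edge term (global relations).

    bg_weight is a hypothetical fixed constant that down-weights
    background nodes; the paper describes an adaptive weight instead.
    """
    n = len(teacher_nodes)

    # Node term: per-node mean squared error, down-weighted for background.
    node_loss = 0.0
    for t, s, fg in zip(teacher_nodes, student_nodes, is_foreground):
        weight = 1.0 if fg else bg_weight
        node_loss += weight * sum((a - b) ** 2 for a, b in zip(t, s)) / len(t)
    node_loss /= n

    # Edge term: mean squared error between the two edge matrices.
    teacher_edges = build_edges(teacher_nodes)
    student_edges = build_edges(student_nodes)
    edge_loss = sum(
        (teacher_edges[i][j] - student_edges[i][j]) ** 2
        for i in range(n)
        for j in range(n)
    ) / (n * n)

    return node_loss + edge_loss
```

Under these assumptions, a student whose proposal features match the teacher's exactly incurs zero loss, while a student that collapses two distinct instances onto the same feature is penalized by both the node term and the edge term.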