Incremental learning methods learn new classes continually by distilling knowledge from the last model (the teacher) into the current model (the student) during sequential learning. However, these methods do not work for Incremental Implicitly-Refined Classification (IIRC), an extension of incremental learning in which incoming classes can have two granularity levels: a superclass label and a subclass label. The reason is that previously learned superclass knowledge may be displaced by subclass knowledge learned later. To solve this problem, we propose a novel Multi-Teacher Knowledge Distillation (MTKD) strategy. To preserve subclass knowledge, we use the last model as a general teacher that distills previously learned knowledge into the student model. To preserve superclass knowledge, we use the initial model as a superclass teacher, since the initial model contains abundant superclass knowledge. However, distilling knowledge from two teacher models can cause the student model to make redundant predictions. We therefore propose a post-processing mechanism, called Top-k prediction restriction, to reduce these redundant predictions. Our experimental results on IIRC-ImageNet120 and IIRC-CIFAR100 show that the proposed method achieves better classification accuracy than existing state-of-the-art methods.
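
The two-teacher objective and the Top-k restriction described above can be illustrated with a minimal sketch. The abstract does not give the loss formulation, so everything here is an assumption made for illustration: the sigmoid/binary cross-entropy form (chosen because IIRC is a multi-label setting), the weights `lam_general` and `lam_super`, the index list `super_idx`, the 0.5 threshold, and the value of `k` are all hypothetical, as are the function names.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def bce(logits, targets, eps=1e-7):
    """Mean binary cross-entropy between sigmoid(logits) and targets in [0, 1]."""
    total = 0.0
    for z, t in zip(logits, targets):
        p = min(max(sigmoid(z), eps), 1.0 - eps)
        total -= t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
    return total / len(logits)

def mtkd_loss(student, labels, general_teacher, super_teacher, super_idx,
              lam_general=1.0, lam_super=1.0):
    """Assumed loss for one sample: classification plus two distillation terms.

    student         -- student logits over all classes seen so far
    labels          -- ground-truth multi-hot labels, same length as student
    general_teacher -- logits of the last model (covers the old classes only)
    super_teacher   -- logits of the initial model
    super_idx       -- hypothetical indices of the superclass outputs
    """
    n_old = len(general_teacher)
    # Classification term on the newly introduced classes.
    cls = bce(student[n_old:], labels[n_old:])
    # General teacher: soft targets for every previously learned class.
    kd_general = bce(student[:n_old], [sigmoid(z) for z in general_teacher])
    # Superclass teacher: soft targets restricted to the superclass outputs.
    kd_super = bce([student[i] for i in super_idx],
                   [sigmoid(super_teacher[i]) for i in super_idx])
    return cls + lam_general * kd_general + lam_super * kd_super

def top_k_restrict(probs, k, threshold=0.5):
    """Return the predicted class indices, keeping at most the k highest scores."""
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return sorted(i for i in ranked[:k] if probs[i] > threshold)
```

For example, `top_k_restrict([0.9, 0.8, 0.7, 0.1], k=2)` keeps only the two highest-scoring classes even though three scores exceed the threshold, which is the sense in which the restriction trims redundant predictions in this sketch.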