Mutual knowledge distillation (MKD) improves a model by distilling knowledge from another model. However, not all knowledge is certain and correct, especially under adverse conditions. For example, label noise usually leads to less reliable models due to undesired memorisation [1, 2]; wrong knowledge then misleads learning rather than helping it. This problem can be addressed from two aspects: (i) improving the reliability of the model that the knowledge comes from (i.e., the reliability of the knowledge source); (ii) selecting reliable knowledge for distillation. In the literature, making a model more reliable is widely studied, whereas selective MKD receives little attention. Therefore, in this work we focus on selective MKD and highlight its importance. Concretely, we design a generic MKD framework, Confident knowledge selection followed by Mutual Distillation (CMD). The key component of CMD is a generic knowledge selection formulation, in which the selection threshold is either static (CMD-S) or progressive (CMD-P). Additionally, CMD covers two special cases, zero knowledge and all knowledge, leading to a unified MKD framework. We empirically find that CMD-P performs better than CMD-S; the main reason is that a model's knowledge improves and becomes more confident as training progresses. Extensive experiments are presented to demonstrate the effectiveness of CMD and thoroughly justify its design. For example, CMD-P obtains new state-of-the-art results in robustness against label noise.
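To make the selection idea concrete, below is a minimal PyTorch sketch of confidence-based knowledge selection for mutual distillation. It assumes confidence is measured by the peer's maximum softmax probability and that the progressive variant adjusts the selection threshold linearly over epochs; the function names, schedule shape, and endpoint values are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def selective_distillation_loss(student_logits, peer_logits, threshold):
    """KL distillation from a peer model, restricted to samples whose peer
    prediction is confident (max softmax probability >= threshold).
    Illustrative sketch only, not the exact CMD objective."""
    peer_probs = F.softmax(peer_logits.detach(), dim=1)
    confident = peer_probs.max(dim=1).values >= threshold  # boolean mask over the batch
    if not confident.any():
        # No confident knowledge selected: contribute zero distillation loss
        # (this corresponds to the "zero knowledge" special case).
        return student_logits.new_zeros(())
    log_student = F.log_softmax(student_logits[confident], dim=1)
    return F.kl_div(log_student, peer_probs[confident], reduction="batchmean")

def progressive_threshold(epoch, total_epochs, t_min=0.5, t_max=0.95):
    """One plausible progressive schedule in the spirit of CMD-P: the threshold
    tightens as the peer's knowledge becomes more confident during training.
    The linear ramp and the endpoints t_min/t_max are assumptions."""
    return t_min + (t_max - t_min) * epoch / max(total_epochs - 1, 1)
```

In mutual distillation, each model would add this term to its own supervised loss, using the other model as the peer; a static threshold (CMD-S) simply keeps `threshold` fixed, and setting it to 0 or above 1 recovers the all-knowledge and zero-knowledge special cases.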