We tackle the long-tailed visual recognition problem from the knowledge distillation perspective by proposing a Distill the Virtual Examples (DiVE) method. Specifically, by treating the predictions of a teacher model as virtual examples, we prove that distilling from these virtual examples is equivalent to label distribution learning under certain constraints. We show that when the virtual example distribution becomes flatter than the original input distribution, the under-represented tail classes will receive significant improvements, which is crucial in long-tailed recognition. The proposed DiVE method can explicitly tune the virtual example distribution to become flat. Extensive experiments on three benchmark datasets, including the large-scale iNaturalist ones, justify that the proposed DiVE method can significantly outperform state-of-the-art methods. Furthermore, additional analyses and experiments verify the virtual example interpretation, and demonstrate the effectiveness of tailored designs in DiVE for long-tailed problems.
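As a rough illustration of the core idea (distilling from a flattened teacher distribution treated as virtual examples), the sketch below uses a standard temperature-scaled knowledge-distillation loss. The function name, temperature, and loss weighting are illustrative assumptions, not the paper's exact DiVE formulation.

```python
import torch
import torch.nn.functional as F

def virtual_example_kd_loss(student_logits, teacher_logits, targets,
                            temperature=3.0, alpha=0.5):
    """Minimal sketch: distill from temperature-flattened teacher
    predictions ("virtual examples") plus an ordinary cross-entropy term.

    A larger `temperature` makes the teacher distribution flatter, which
    shifts probability mass toward under-represented tail classes.
    """
    # Flattened teacher distribution: the "virtual examples".
    virtual = F.softmax(teacher_logits / temperature, dim=1)
    # Student log-probabilities at the same temperature.
    log_student = F.log_softmax(student_logits / temperature, dim=1)
    # KL divergence between virtual examples and student predictions,
    # scaled by T^2 as is standard in knowledge distillation.
    kd = F.kl_div(log_student, virtual, reduction="batchmean") * temperature ** 2
    # Ordinary cross-entropy on the ground-truth labels.
    ce = F.cross_entropy(student_logits, targets)
    return alpha * kd + (1.0 - alpha) * ce
```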