Few-shot Learning with Global Relatedness Decoupled-Distillation

Despite the success of metric-learning-based approaches in few-shot learning, recent works reveal the ineffectiveness of their episodic training mode. In this paper, we point out two potential reasons for this problem: 1) the random episodic labels provide only limited supervision, while the relatedness information between the query and support samples is not fully exploited; 2) the meta-learner is usually constrained by the limited contextual information of the local episode. To overcome these problems, we propose a new Global Relatedness Decoupled-Distillation (GRDD) method that uses global category knowledge and the Relatedness Decoupled-Distillation (RDD) strategy. GRDD learns new visual concepts quickly by imitating a human habit, i.e., learning from the deep knowledge distilled from a teacher. More specifically, we first train a global learner on the entire base subset, using category labels as supervision, to leverage the global context information of the categories. Then, the well-trained global learner is used to simulate the query-support relatedness in global dependencies. Finally, the distilled global query-support relatedness is explicitly used to train the meta-learner with the RDD strategy, with the goal of making the meta-learner more discriminative. The RDD strategy decouples the dense query-support relatedness into groups of sparse decoupled relatedness, where each group considers only the relatedness of a single support sample with the query samples. By distilling the sparse decoupled relatedness group by group, sharper relatedness can be effectively distilled to the meta-learner, thereby facilitating the learning of a discriminative meta-learner. Extensive experiments on the miniImageNet and CIFAR-FS datasets show that GRDD achieves state-of-the-art performance.
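
To make the group-by-group distillation concrete, the following is a minimal NumPy sketch of one possible reading of the RDD strategy. It is not the authors' implementation. The abstract does not specify the relatedness measure, the normalization, or the loss, so cosine similarity, a softmax over the query samples within each group, a temperature of 0.1, and a KL-divergence objective are all assumptions. The function names `cosine_relatedness` and `decoupled_distillation_loss` are hypothetical, and random vectors stand in for the embeddings of the global learner (teacher) and the meta-learner (student).

```python
import numpy as np


def cosine_relatedness(query, support):
    """Return a (Q, S) matrix of cosine similarities (assumed relatedness measure)."""
    q = query / np.linalg.norm(query, axis=1, keepdims=True)
    s = support / np.linalg.norm(support, axis=1, keepdims=True)
    return q @ s.T


def log_softmax(x, axis):
    """Numerically stable log-softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    return x - np.log(np.exp(x).sum(axis=axis, keepdims=True))


def decoupled_distillation_loss(teacher_rel, student_rel, temperature=0.1):
    """Mean KL(teacher || student) over per-support groups.

    Each column of a (Q, S) relatedness matrix is treated as one group:
    the relatedness of a single support sample with every query sample.
    Normalizing over queries (axis=0) is an assumption of this sketch.
    """
    t_log = log_softmax(teacher_rel / temperature, axis=0)
    s_log = log_softmax(student_rel / temperature, axis=0)
    kl_per_group = (np.exp(t_log) * (t_log - s_log)).sum(axis=0)  # shape (S,)
    return float(kl_per_group.mean())


# Toy episode: 5 support samples, 15 query samples, 16-dimensional embeddings.
rng = np.random.default_rng(0)
teacher_rel = cosine_relatedness(rng.normal(size=(15, 16)), rng.normal(size=(5, 16)))
student_rel = cosine_relatedness(rng.normal(size=(15, 16)), rng.normal(size=(5, 16)))
loss = decoupled_distillation_loss(teacher_rel, student_rel)
```

In this sketch each support sample contributes its own distribution over the query samples, so the teacher signal is matched one group at a time instead of through a single dense query-support matrix. In an actual training loop the student relatedness would come from the meta-learner's embeddings and the loss would be minimized by gradient descent.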
