Few-shot learning is challenging because only very limited data and labels are available. Recent studies on big transfer (BiT) show that few-shot learning can benefit greatly from pretraining on a large-scale labeled dataset from a different domain. This paper asks a more challenging question: "can we use as few labels as possible for few-shot learning, in both pretraining (with no labels) and fine-tuning (with fewer labels)?" Our key insight is that a good clustering of the target samples in the feature space is all we need for few-shot finetuning. This explains why vanilla unsupervised pretraining (poor clustering) is worse than supervised pretraining. In this paper, we propose transductive unsupervised pretraining, which achieves better clustering by involving the target data even though their amount is very limited. The improved clustering is of great value for identifying the most representative samples ("eigen-samples") for users to label, and in return, continued finetuning with the labeled eigen-samples further improves the clustering. We therefore propose eigen-finetuning, which enables fewer-shot learning by leveraging the co-evolution of clustering and eigen-samples during finetuning. We conduct experiments on 10 different few-shot target datasets, and our average few-shot performance outperforms both vanilla inductive unsupervised transfer and supervised transfer by a large margin. For instance, when each target category has only 10 labeled samples, the mean accuracy gain over these two baselines is 9.2% and 3.42%, respectively.
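To make the eigen-sample idea concrete, the following is a minimal sketch of how representative samples could be selected from a clustering of the target features. It assumes k-means clustering in the pretrained feature space and picks the sample nearest to each centroid as the cluster's eigen-sample; both the clustering algorithm and the selection rule are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np
from sklearn.cluster import KMeans


def select_eigen_samples(features: np.ndarray, num_clusters: int) -> np.ndarray:
    """Return the index of one representative ("eigen") sample per cluster.

    features: (N, D) array of target-sample embeddings from the pretrained encoder.
    num_clusters: number of clusters, e.g. the assumed number of target classes.
    """
    kmeans = KMeans(n_clusters=num_clusters, n_init=10, random_state=0)
    assignments = kmeans.fit_predict(features)

    eigen_indices = []
    for c in range(num_clusters):
        members = np.where(assignments == c)[0]
        # Choose the member closest to the centroid as this cluster's eigen-sample.
        dists = np.linalg.norm(features[members] - kmeans.cluster_centers_[c], axis=1)
        eigen_indices.append(members[np.argmin(dists)])
    return np.array(eigen_indices)
```

In a co-evolution loop of the kind described above, one would label only the returned samples, finetune the model on them, re-embed the target data, and repeat, so that the clustering and the eigen-samples improve together.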