In this paper, we propose $K$-Shot Contrastive Learning (KSCL) of visual features, which applies multiple augmentations to each instance to investigate the sample variations within individual instances. It aims to combine the advantages of inter-instance discrimination, by learning discriminative features that distinguish between different instances, with those of intra-instance variation, by matching queries against the variants of augmented samples across instances. In particular, for each instance, it constructs an instance subspace that models how the significant factors of variation in the $K$-shot augmentations can be combined to form variants of augmentations. Given a query, the most relevant variant of each instance is then retrieved by projecting the query onto the instance subspaces, and the positive instance class is predicted from these projections. This generalizes existing contrastive learning, which can be viewed as the special one-shot case. An eigenvalue decomposition is performed to configure the instance subspaces, and the embedding network can be trained end-to-end through the differentiable subspace configuration. Experimental results demonstrate that the proposed $K$-shot contrastive learning achieves superior performance over state-of-the-art unsupervised methods.
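As a rough illustration of the subspace idea described in the abstract, the following NumPy sketch builds one subspace per instance from its $K$ augmented embeddings and scores a query by its projection onto each subspace. The abstract does not give the exact formulation, so several choices here are assumptions: the function names (`instance_subspace`, `projection_scores`, `predict_instance`), the use of an uncentered second-moment matrix, the `rank` parameter, and the squared projection length as the matching score are all hypothetical. The sketch also omits the embedding network and the end-to-end training.

```python
import numpy as np


def instance_subspace(shots, rank):
    """Return an orthonormal basis (d, rank) for one instance.

    shots: array of shape (K, d), the embeddings of K augmentations.
    Assumption: the subspace is spanned by the top-`rank` eigenvectors
    of the uncentered second-moment matrix of the shots.
    """
    second_moment = shots.T @ shots / shots.shape[0]  # (d, d), symmetric PSD
    _, eigvecs = np.linalg.eigh(second_moment)        # eigenvalues ascending
    return eigvecs[:, -rank:]                         # keep the largest ones


def projection_scores(query, bases):
    """Squared length of the projection of `query` onto each subspace."""
    return np.array([np.sum((basis.T @ query) ** 2) for basis in bases])


def predict_instance(query, bases):
    """Index of the instance whose subspace captures the query best."""
    return int(np.argmax(projection_scores(query, bases)))


# Toy data (hypothetical): 4 instances, K = 5 shots each, d = 16,
# each instance varying along 2 latent directions plus small noise.
rng = np.random.default_rng(0)
d, K, rank, n_instances = 16, 5, 2, 4
planes = [np.linalg.qr(rng.normal(size=(d, rank)))[0] for _ in range(n_instances)]
bases = []
for plane in planes:
    shots = rng.normal(size=(K, rank)) @ plane.T + 0.01 * rng.normal(size=(K, d))
    bases.append(instance_subspace(shots, rank))

# A query drawn from the plane of instance 2, normalized to unit length.
query = planes[2] @ rng.normal(size=rank)
query /= np.linalg.norm(query)
print("scores:", projection_scores(query, bases))
print("predicted instance:", predict_instance(query, bases))
```

Under these assumptions, setting $K = 1$ and `rank = 1` reduces the score to the squared cosine similarity between a unit-length query and the single augmented embedding, which is one way to read the claim that ordinary contrastive learning is the one-shot special case. In a training setting the scores would typically feed a softmax-style contrastive loss, with gradients flowing through the eigendecomposition.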