Contrastive learning based on instance discrimination trains a model to discriminate between different transformations of an anchor sample and other samples, without considering the semantic similarity among samples. This paper proposes a new contrastive learning method, named CLIM, which draws positives from other samples in the dataset. This is achieved by searching for locally similar samples of the anchor and selecting those that lie closer to the corresponding cluster center, a procedure we denote as center-wise local image selection. The selected samples are instantiated via a data mixture strategy, which acts as a smoothing regularizer. As a result, CLIM encourages both local similarity and global aggregation in a robust way, which we find is beneficial for feature representation. In addition, we introduce \emph{multi-resolution} augmentation, which makes the representation scale invariant. We reach 75.5% top-1 accuracy with linear evaluation over ResNet-50, and 59.3% top-1 accuracy when fine-tuned with only 1% of the labels.
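To make the two core steps concrete, the following is a minimal sketch of center-wise local image selection and the data mixture step, under our own simplifying assumptions: the function names, the cosine-similarity neighborhood, and the mixup-style pixel mixing are illustrative choices, not the paper's exact procedure.

```python
import torch
import torch.nn.functional as F


def center_wise_local_selection(anchor_feat, bank_feats, centers,
                                anchor_cluster, k=10):
    """Sketch of center-wise local image selection: among the k nearest
    neighbors of the anchor in feature space, keep only those lying closer
    to the anchor's cluster center than the anchor itself, so the selected
    positives encourage both local similarity and global aggregation."""
    # Cosine similarity between the anchor and every feature in the bank.
    sims = F.cosine_similarity(anchor_feat.unsqueeze(0), bank_feats, dim=1)
    knn = sims.topk(k).indices  # indices of locally similar samples

    center = centers[anchor_cluster]
    anchor_dist = 1.0 - F.cosine_similarity(anchor_feat, center, dim=0)
    neigh_dists = 1.0 - F.cosine_similarity(
        bank_feats[knn], center.unsqueeze(0), dim=1)

    # Keep neighbors closer to the cluster center than the anchor is.
    return knn[neigh_dists < anchor_dist]


def mix_images(anchor_img, positive_img, alpha=1.0):
    """One plausible instantiation of the data mixture strategy: mixup-style
    convex combination of anchor and selected positive, acting as a
    smoothing regularizer on the contrastive targets."""
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    mixed = lam * anchor_img + (1.0 - lam) * positive_img
    return mixed, lam
```

In this sketch, `bank_feats` and `centers` stand for a memory bank of features and cluster centers obtained from an off-line clustering step (both assumed here); a CutMix-style regional mixture would slot into `mix_images` in the same way.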