Self-supervised learning has recently shown great potential in vision tasks through contrastive learning, which aims to discriminate each image, or instance, in the dataset. However, such instance-level learning ignores the semantic relationships among instances and sometimes undesirably repels the anchor from semantically similar samples, termed "false negatives". In this work, we show that the unfavorable effect of false negatives is more significant for large-scale datasets with more semantic concepts. To address the issue, we propose a novel self-supervised contrastive learning framework that incrementally detects and explicitly removes false negative samples. Specifically, as training proceeds, our method dynamically detects an increasing number of high-quality false negatives, exploiting the fact that the encoder gradually improves and the embedding space becomes more semantically structured. Next, we discuss two strategies to explicitly remove the detected false negatives during contrastive learning. Extensive experiments show that our framework outperforms other self-supervised contrastive learning methods on multiple benchmarks in a limited-resource setup.
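As a rough illustration of the "explicit removal" idea, the sketch below drops detected false negatives from the denominator of an InfoNCE-style contrastive loss. The tensor shapes, the temperature value, and the `false_negative_mask` input (assumed to come from some detection step, e.g. clustering in the embedding space) are illustrative assumptions, not the paper's actual implementation.

```python
import torch
import torch.nn.functional as F

def info_nce_with_fn_elimination(anchor, positive, negatives,
                                 false_negative_mask, temperature=0.1):
    """InfoNCE loss that eliminates detected false negatives from the
    denominator instead of treating them as negatives.

    anchor:              (B, D) anchor-view embeddings
    positive:            (B, D) positive-view embeddings
    negatives:           (B, K, D) candidate negative embeddings
    false_negative_mask: (B, K) bool, True where a candidate was
                         detected as a false negative
    """
    anchor = F.normalize(anchor, dim=-1)
    positive = F.normalize(positive, dim=-1)
    negatives = F.normalize(negatives, dim=-1)

    # Cosine similarities scaled by temperature.
    pos_logit = (anchor * positive).sum(-1, keepdim=True) / temperature       # (B, 1)
    neg_logits = torch.einsum("bd,bkd->bk", anchor, negatives) / temperature  # (B, K)

    # Elimination strategy: detected false negatives contribute nothing
    # to the denominator (their logits become -inf before the softmax).
    neg_logits = neg_logits.masked_fill(false_negative_mask, float("-inf"))

    logits = torch.cat([pos_logit, neg_logits], dim=1)  # positive is class 0
    labels = torch.zeros(anchor.size(0), dtype=torch.long, device=anchor.device)
    return F.cross_entropy(logits, labels)
```

The alternative strategy mentioned in the abstract would instead treat detected false negatives as additional positives (attraction) rather than simply excluding them.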