Most self-supervised representation learning methods are based on the contrastive loss and the instance-discrimination task, where augmented versions of the same image instance ("positives") are contrasted with instances extracted from other images ("negatives"). For the learning to be effective, many negatives should be contrasted with each positive pair, which is computationally demanding. In this paper, we propose a different direction and a new loss function for self-supervised representation learning which is based on the whitening of the latent-space features. The whitening operation has a "scattering" effect on the batch samples, which compensates for the lack of negatives and avoids degenerate solutions where all the sample representations collapse to a single point. Our Whitening MSE (W-MSE) loss does not require special heuristics (e.g., additional networks) and is conceptually simple. Since negatives are not needed, we can extract multiple positive pairs from the same image instance. We empirically show that W-MSE is competitive with respect to popular, more complex self-supervised methods. The source code of the method and of all the experiments is available at https://github.com/htdt/self-supervised.
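The idea can be illustrated with a minimal sketch of the loss: embeddings of a batch of positive views are whitened (zero mean, identity covariance), projected onto the unit sphere, and compared with a plain MSE. The function names `whiten` and `w_mse_loss` below are hypothetical and the whitening variant (Cholesky-based) is an assumption; the authors' exact implementation is in the linked repository.

```python
import torch
import torch.nn.functional as F

def whiten(z, eps=1e-5):
    """Whiten a batch of embeddings: zero mean, identity covariance.

    A Cholesky-based whitening is used here for illustration; the paper's
    implementation may differ in details (e.g. whitening variant, sub-batching).
    """
    z = z - z.mean(dim=0)                                  # center the batch
    cov = z.T @ z / (z.shape[0] - 1)                       # empirical covariance (D x D)
    cov = cov + eps * torch.eye(z.shape[1], device=z.device)
    W = torch.linalg.cholesky(torch.linalg.inv(cov))       # whitening matrix: W.T @ cov @ W = I
    return z @ W

def w_mse_loss(z1, z2):
    """Illustrative W-MSE for one positive pair per image.

    z1, z2: (N, D) embeddings of two augmented views of the same N images.
    Both views are whitened jointly, normalized, and compared with MSE.
    """
    z = torch.cat([z1, z2], dim=0)                         # (2N, D)
    z = whiten(z)
    z = F.normalize(z, dim=1)                              # project onto the unit sphere
    v1, v2 = z.chunk(2, dim=0)
    return F.mse_loss(v1, v2)
```

Because no negatives enter the loss, more than two positive views per image can be drawn and all resulting pairs averaged, which the abstract refers to as extracting multiple positive pairs from the same image instance.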