Unsupervised contrastive learning has achieved outstanding success, yet the mechanism of the contrastive loss has received less study. In this paper, we focus on understanding the behaviour of the unsupervised contrastive loss. We show that the contrastive loss is a hardness-aware loss function, and that the temperature $\tau$ controls the strength of the penalties on hard negative samples. A previous study has shown that uniformity is a key property of contrastive learning. We relate uniformity to the temperature $\tau$. We show that uniformity helps contrastive learning learn separable features, but that an excessive pursuit of uniformity makes the contrastive loss intolerant of semantically similar samples, which may break the underlying semantic structure and harm the formation of features useful for downstream tasks. This is caused by an inherent defect of the instance discrimination objective. Specifically, the instance discrimination objective tries to push all different instances apart, ignoring the underlying relations between samples. Pushing semantically consistent samples apart does nothing to help acquire a prior that is informative for general downstream tasks. A well-designed contrastive loss should have some degree of tolerance to the closeness of semantically similar samples. We therefore find that the contrastive loss faces a uniformity-tolerance dilemma, and that a good choice of temperature strikes a balance between these two properties, learning separable features while remaining tolerant of semantically similar samples, which improves feature quality and downstream performance.
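
To make the hardness-aware claim concrete, the following is a minimal sketch rather than the paper's implementation. It assumes a softmax-based (InfoNCE-style) contrastive loss, in which the gradient with respect to each negative similarity is proportional to that negative's softmax probability. The function name `negative_penalty_weights` and the similarity values are hypothetical and chosen only for illustration.

```python
import math

def negative_penalty_weights(neg_sims, pos_sim, tau):
    """Relative gradient weight on each negative sample.

    Assumes a softmax-based contrastive loss
        L = -log( exp(pos_sim / tau) / (exp(pos_sim / tau) + sum_k exp(neg_sims[k] / tau)) ),
    for which dL/d(neg_sims[k]) = P_k / tau, with P_k the softmax probability
    of negative k. The weights returned here are P_k normalized over negatives.
    """
    logits = [pos_sim / tau] + [s / tau for s in neg_sims]
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(l - shift) for l in logits]
    total = sum(exps)
    neg_probs = [e / total for e in exps[1:]]
    neg_total = sum(neg_probs)
    return [p / neg_total for p in neg_probs]

# Hypothetical cosine similarities between an anchor and four negatives,
# ordered from hardest (most similar) to easiest.
neg_sims = [0.9, 0.5, 0.0, -0.5]

for tau in (0.07, 0.5, 5.0):
    weights = negative_penalty_weights(neg_sims, pos_sim=0.95, tau=tau)
    print(f"tau={tau}: {[round(w, 3) for w in weights]}")
```

Under these assumptions, a small temperature places almost all of the penalty on the hardest negative, while a large temperature spreads it nearly evenly across the negatives, which is one way to read the uniformity-tolerance trade-off described above.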