Contrastive learning (CL) has emerged as a dominant technique for unsupervised representation learning: it embeds augmented versions of an anchor (positive samples) close to each other and pushes the embeddings of other samples (negative samples) apart. Recent work shows that CL can benefit from hard negative samples, i.e., negative samples that are difficult to distinguish from the anchor. However, we observe only minor improvement, or even a performance drop, when we adopt existing hard negative mining techniques in Graph Contrastive Learning (GCL). We find that in GCL many hard negative samples similar to the anchor are in fact false negatives (samples from the same class as the anchor). This differs from CL in computer vision and leads to the unsatisfactory performance of existing hard negative mining techniques in GCL. To eliminate this bias, we propose Debiased Graph Contrastive Learning (DGCL), a novel and effective method that estimates the probability that each negative sample is a true negative. With this probability, we devise two schemes (i.e., DGCL-weight and DGCL-mix) to boost the performance of GCL. Empirically, DGCL outperforms or matches previous unsupervised state-of-the-art results on several benchmarks and even exceeds the performance of supervised counterparts.
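To make the weighting idea concrete, the following is a minimal sketch of an InfoNCE-style contrastive loss in which each negative's contribution is scaled by an estimated probability of being a true negative. It is an illustration under stated assumptions, not the DGCL-weight formulation itself, which the abstract does not spell out. The function names (`cosine`, `weighted_info_nce`), the temperature `tau`, and the probabilities passed as `p_true` are all hypothetical; how DGCL actually estimates those probabilities is not described here.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def weighted_info_nce(anchor, positive, negatives, p_true, tau=0.5):
    """InfoNCE-style loss for a single anchor.

    Each negative's term is multiplied by p_true[i], an assumed estimate of
    the probability that negatives[i] is a true negative. Setting every
    weight to 1.0 recovers the usual unweighted loss.
    """
    if len(negatives) != len(p_true):
        raise ValueError("need exactly one weight per negative sample")
    pos_term = math.exp(cosine(anchor, positive) / tau)
    neg_term = sum(
        w * math.exp(cosine(anchor, neg) / tau)
        for neg, w in zip(negatives, p_true)
    )
    return -math.log(pos_term / (pos_term + neg_term))


# Toy embeddings (made up): the first negative sits very close to the
# anchor, so it is a hard negative that may well be a false negative.
anchor = [1.0, 0.0]
positive = [0.9, 0.1]
negatives = [[0.95, 0.05], [-1.0, 0.2]]

plain = weighted_info_nce(anchor, positive, negatives, [1.0, 1.0])
debiased = weighted_info_nce(anchor, positive, negatives, [0.1, 1.0])
print(plain, debiased)
```

In this toy setting, down-weighting the suspected false negative shrinks its repulsive term, so the loss no longer pushes the anchor as strongly away from a sample that probably shares its class.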