Contrastive Learning (CL) has emerged as a dominant technique for unsupervised representation learning, which embeds augmented versions of the anchor close to each other (positive samples) and pushes the embeddings of other samples (negatives) apart. As revealed in recent studies, CL can benefit from hard negatives (negatives that are most similar to the anchor). However, we observe limited benefits when adopting existing hard negative mining techniques from other domains in Graph Contrastive Learning (GCL). We perform both experimental and theoretical analyses of this phenomenon and find that it can be attributed to the message passing of Graph Neural Networks (GNNs). Unlike CL in other domains, in GCL most hard negatives are potentially false negatives (negatives that share the same class as the anchor) if they are selected merely according to their similarities to the anchor, which undesirably pushes away samples of the same class. To remedy this deficiency, we propose an effective method, dubbed \textbf{ProGCL}, to estimate the probability of a negative being a true one, which, together with similarity, constitutes a more suitable measure of a negative's hardness. Additionally, we devise two schemes (i.e., \textbf{ProGCL-weight} and \textbf{ProGCL-mix}) to boost the performance of GCL. Extensive experiments demonstrate that ProGCL brings notable and consistent improvements over base GCL methods, yields multiple state-of-the-art results on several unsupervised benchmarks, and even exceeds the performance of supervised counterparts. Also, ProGCL is readily pluggable into various negatives-based GCL methods for performance improvement. We release the code at \textcolor{magenta}{\url{https://github.com/junxia97/ProGCL}}.
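As a rough illustration only (the abstract does not spell out the exact ProGCL-weight formulation), the sketch below assumes an InfoNCE-style objective in PyTorch in which each negative's contribution is re-weighted by an externally estimated probability of being a true negative combined with its similarity to the anchor. The function name \texttt{weighted\_info\_nce}, the tensor shapes, and the mean normalization of the weights are all illustrative assumptions, not the paper's definitive implementation.

\begin{verbatim}
import torch
import torch.nn.functional as F

def weighted_info_nce(anchor, positive, negatives, prob_true_neg,
                      temperature=0.2):
    """Hypothetical sketch: InfoNCE-style loss where each negative is
    weighted by its estimated probability of being a true negative
    times its similarity to the anchor (a proxy for hardness).

    anchor:        (B, d)    anchor embeddings
    positive:      (B, d)    embeddings of the positive views
    negatives:     (B, K, d) K candidate negatives per anchor
    prob_true_neg: (B, K)    estimated prob. each negative is a true one
    """
    # Cosine similarities between the anchor and positive / negatives.
    pos_sim = F.cosine_similarity(anchor, positive, dim=-1)            # (B,)
    neg_sim = F.cosine_similarity(
        anchor.unsqueeze(1).expand_as(negatives), negatives, dim=-1)   # (B, K)

    # Hardness measure (assumption): probability of being a true negative
    # times similarity, mapped to [0, 1] so the weights stay non-negative.
    hardness = prob_true_neg * (neg_sim + 1.0) / 2.0
    weights = hardness / hardness.mean(dim=1, keepdim=True)            # (B, K)

    # Weighted InfoNCE: hard (and likely true) negatives contribute more.
    pos_exp = (pos_sim / temperature).exp()
    neg_exp = (weights * (neg_sim / temperature).exp()).sum(dim=1)
    return -torch.log(pos_exp / (pos_exp + neg_exp)).mean()
\end{verbatim}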