Graph Contrastive Learning (GCL) has proven highly effective at improving the performance of Semi-Supervised Node Classification (SSNC). However, existing GCL methods are generally transferred from other fields such as CV or NLP, and their underlying working mechanism remains under-explored. In this work, we first probe the working mechanism of GCL in SSNC and find that the improvement brought by GCL is distributed very unevenly: it comes mainly from subgraphs with less annotated information, which is fundamentally different from contrastive learning in other fields. Nevertheless, existing GCL methods generally ignore this uneven distribution of annotated information and apply GCL evenly to the whole graph. To remedy this issue and further improve GCL in SSNC, we propose the Topology InFormation gain-Aware Graph Contrastive Learning (TIFA-GCL) framework, which takes the distribution of annotated information across the graph into account in GCL. Extensive experiments on six benchmark graph datasets, including the enormous OGB-Products graph, show that TIFA-GCL brings a larger improvement than existing GCL methods in both transductive and inductive settings. Further experiments demonstrate the generalizability and interpretability of TIFA-GCL.