Graph contrastive learning (GCL) has emerged as one of the most representative approaches to graph representation learning; it leverages the principle of mutual information maximization (InfoMax) to learn node representations that are then applied to downstream tasks. To improve generalization from GCL to downstream tasks, previous methods heuristically define data augmentations or pretext tasks. However, the generalization ability of GCL and its underlying theoretical principles remain under-explored. In this paper, we first propose GCL-GE, a metric for the generalization ability of GCL. Since this metric is intractable when the downstream task is unknown, we theoretically prove a mutual-information upper bound on it from an information-theoretic perspective. Guided by this bound, we design a GCL framework named InfoAdv with enhanced generalization ability, which jointly optimizes the generalization metric and the InfoMax objective to strike the right balance between fitting the pretext task and generalizing to downstream tasks. We empirically validate our theoretical findings on a number of representative benchmarks, and the experimental results demonstrate that our model achieves state-of-the-art performance.
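For context, GCL methods commonly instantiate the InfoMax principle via the InfoNCE lower bound on the mutual information between two augmented views of a graph. The following is a minimal sketch of that standard node-level objective (the generic formulation found in the GCL literature, shown for illustration and not necessarily the exact loss used by InfoAdv), where $u_i$ and $v_i$ denote the representations of node $i$ in the two views, $\theta(\cdot,\cdot)$ is a similarity function such as cosine similarity, and $\tau$ is a temperature hyperparameter:
\[
\mathcal{L}_{\mathrm{InfoNCE}}
  = -\frac{1}{N}\sum_{i=1}^{N}
    \log\frac{\exp\!\big(\theta(u_i, v_i)/\tau\big)}
             {\sum_{j=1}^{N}\exp\!\big(\theta(u_i, v_j)/\tau\big)},
\qquad
I(U;V)\;\ge\;\log N-\mathcal{L}_{\mathrm{InfoNCE}}.
\]
Minimizing $\mathcal{L}_{\mathrm{InfoNCE}}$ therefore maximizes a lower bound on $I(U;V)$, which is the sense in which GCL "maximizes mutual information."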