This paper studies the problem of conducting self-supervised learning for node representation learning on graphs. Most existing self-supervised learning methods assume the graph is homophilous, where linked nodes often belong to the same class or have similar features. However, such assumptions of homophily do not always hold in real-world graphs. We address this problem by developing a decoupled self-supervised learning (DSSL) framework for graph neural networks. DSSL imitates a generative process of nodes and links via latent variable modeling of the semantic structure, which decouples the different underlying semantics between different neighborhoods in the self-supervised learning process. Our DSSL framework is agnostic to the choice of encoder and does not need prefabricated augmentations, and is thus flexible to different graphs. To effectively optimize the framework, we derive the evidence lower bound of the self-supervised objective and develop a scalable training algorithm with variational inference. We provide a theoretical analysis to justify that DSSL enjoys better downstream performance. Extensive experiments on various types of graph benchmarks demonstrate that our proposed framework achieves better performance than competitive baselines.
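To make the optimization claim concrete, the display below is a minimal sketch of the generic form such an evidence lower bound takes, assuming a discrete latent semantic variable $k$ for each linked pair $(u, v)$ with features $x_u, x_v$ and an approximate posterior $q$; it illustrates the standard latent-variable bound, not the paper's exact derivation:
\[
\log p(x_v \mid x_u) \;\ge\; \mathbb{E}_{q(k \mid x_u, x_v)}\!\left[\log p(x_v \mid x_u, k)\right] \;-\; \mathrm{KL}\!\left(q(k \mid x_u, x_v)\,\|\,p(k \mid x_u)\right).
\]
Here the first term reconstructs a neighbor under the inferred latent semantic $k$, while the KL term regularizes the per-neighborhood posterior toward the prior; summing over linked pairs and optimizing with variational-inference gradients gives a scalable training objective of the kind described above.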