The key to learning informative node representations in graphs lies in how to gain contextual information from the neighbourhood. In this work, we present a simple-yet-effective self-supervised node representation learning strategy that directly maximizes the mutual information between the hidden representations of nodes and their neighbourhood, which can be theoretically justified by its link to graph smoothing. Following InfoNCE, our framework is optimized via a surrogate contrastive loss, where positive selection underpins the quality and efficiency of representation learning. To this end, we propose a topology-aware positive sampling strategy, which samples positives from the neighbourhood by considering the structural dependencies between nodes and thus enables positive selection upfront. In the extreme case where only one positive is sampled, we fully avoid expensive neighbourhood aggregation. Our methods achieve promising performance on various node classification datasets. It is also worth mentioning that, by applying our loss function to MLP-based node encoders, our methods can be orders of magnitude faster than existing solutions. Our code and supplementary materials are available at https://github.com/dongwei156/n2n.
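As a minimal sketch of the surrogate objective described above, the loss follows the standard InfoNCE form; the notation here (\(h_v\) for the hidden representation of node \(v\), \(u\) for a positive sampled from the neighbourhood \(\mathcal{N}(v)\), \(\mathrm{sim}(\cdot,\cdot)\) for a similarity score, and \(\tau\) for a temperature) is illustrative and not necessarily the exact formulation used in the paper:

\[
\mathcal{L} \;=\; -\frac{1}{|V|} \sum_{v \in V} \log \frac{\exp\!\big(\mathrm{sim}(h_v, h_u)/\tau\big)}{\sum_{k \in V \setminus \{v\}} \exp\!\big(\mathrm{sim}(h_v, h_k)/\tau\big)}, \qquad u \sim \mathcal{N}(v).
\]

Under this reading, sampling a single positive \(u\) per node replaces explicit neighbourhood aggregation with one contrastive comparison, which is what permits the plain MLP encoder and the corresponding speedup.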