Despite its outstanding performance on various graph tasks, the vanilla Message Passing Neural Network (MPNN) usually fails at link prediction, as it only uses the representations of the two individual target nodes and ignores the pairwise relation between them. To capture pairwise relations, some models add manual features to the input graph and use the MPNN's output to produce pairwise representations, while others directly use manual features as pairwise representations. Though the latter simplification avoids applying a GNN to each link individually and thus improves scalability, these models still leave much room for performance improvement because their pairwise features are hand-crafted and unlearnable. To improve performance while maintaining scalability, we propose Neural Common Neighbor (NCN), which uses learnable pairwise representations. To further boost NCN, we study the unobserved link problem: graph incompleteness is ubiquitous and leads to distribution shifts between the training and test sets, loss of common neighbor information, and degraded model performance. We therefore propose two intervention methods, common neighbor completion and target link removal. Combining them with NCN, we propose Neural Common Neighbor with Completion (NCNC). NCN and NCNC outperform recent strong baselines by large margins, and NCNC achieves state-of-the-art performance on link prediction tasks. Our code is available at https://github.com/GraphPKU/NeuralCommonNeighbor.
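To make the core idea concrete, below is a minimal sketch of how an NCN-style scorer might combine MPNN node embeddings with common neighbor information: instead of merely counting common neighbors (the classical CN heuristic), it sums their learned embeddings and feeds the result through an MLP. The class name `NCNScore`, the `neighbors` dictionary, and the MLP shape are illustrative assumptions, not the repository's actual API.

```python
import torch
import torch.nn as nn


class NCNScore(nn.Module):
    """Hypothetical sketch of an NCN-style link scorer.

    Given node embeddings h (e.g., the output of any MPNN) and the
    neighbor sets of each node, it scores a candidate link (i, j) from
    the Hadamard product of the endpoint embeddings concatenated with
    the summed embeddings of their common neighbors.
    """

    def __init__(self, hidden_dim: int):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(2 * hidden_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, 1),
        )

    def forward(self, h: torch.Tensor, neighbors: dict, i: int, j: int) -> torch.Tensor:
        # Learnable pairwise feature: sum the MPNN embeddings of the
        # common neighbors rather than just counting them.
        common = neighbors[i] & neighbors[j]
        if common:
            cn_feat = h[list(common)].sum(dim=0)
        else:
            cn_feat = torch.zeros_like(h[i])
        pair = torch.cat([h[i] * h[j], cn_feat], dim=-1)
        return self.mlp(pair)  # logit for the existence of link (i, j)
```

Because the pairwise feature is built from the same node embeddings the MPNN already computes, each link only costs one neighbor-set intersection and one MLP call, which is where the scalability advantage over running a GNN per link comes from.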
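The second intervention named in the abstract, target link removal, can also be sketched briefly: before message passing and common neighbor extraction, the candidate edge itself is dropped so the model never conditions on the very link it is asked to predict, matching the incomplete graphs seen at test time. A minimal sketch, assuming a PyTorch-Geometric-style `edge_index` tensor of shape `(2, E)`; the function name is hypothetical.

```python
import torch


def remove_target_link(edge_index: torch.Tensor, i: int, j: int) -> torch.Tensor:
    # Drop both directions of the target edge (i, j) from the edge list,
    # so the graph fed to the MPNN does not contain the link being scored.
    mask = ~(((edge_index[0] == i) & (edge_index[1] == j)) |
             ((edge_index[0] == j) & (edge_index[1] == i)))
    return edge_index[:, mask]
```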