Traditional methods for link prediction fall into three main types: graph structure feature-based, latent feature-based, and explicit feature-based. Graph structure feature methods use handcrafted node proximity scores, such as common neighbors, to estimate the likelihood of links. Latent feature methods factorize a network's matrix representations to learn an embedding for each node. Explicit feature methods train a machine learning model on the explicit attributes of the two nodes. Each type has its own merits. In this paper, we propose SEAL (learning from Subgraphs, Embeddings, and Attributes for Link prediction), a new framework for link prediction that combines the strengths of all three types in a single graph neural network (GNN). A GNN is a new type of neural network that directly accepts graphs as input and outputs their labels. In SEAL, the input to the GNN is a local subgraph around each target link. We prove theoretically that these local subgraphs also preserve a great deal of high-order graph structure features related to link existence. Another key feature is that our GNN can naturally incorporate latent features and explicit features. This is achieved by concatenating node embeddings (latent features) and node attributes (explicit features) in the node information matrix of each subgraph, so that all three types of features contribute to GNN learning. In extensive experiments, SEAL shows unprecedentedly strong performance against a wide range of baseline methods, including various link prediction heuristics and network embedding methods.
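To make the three ingredients concrete, the following is a minimal sketch rather than the SEAL implementation. It assumes an undirected graph stored as a dictionary of neighbor sets and shows a common-neighbors score, the extraction of a local subgraph around a target link, and the concatenation of embeddings and attributes into a node information matrix. The function names, the hop parameter `h`, the toy graph, and the toy embedding and attribute values are all hypothetical and chosen only for illustration.

```python
from collections import deque


def common_neighbors(adj, u, v):
    """Handcrafted proximity score: number of neighbors shared by u and v."""
    return len(adj[u] & adj[v])


def enclosing_subgraph(adj, u, v, h=1):
    """Nodes within h hops of u or v, plus the edges induced among them.

    Assumption: a multi-source breadth-first search from both endpoints is
    used here; the hop limit h is an illustrative parameter.
    """
    dist = {u: 0, v: 0}
    queue = deque([u, v])
    while queue:
        node = queue.popleft()
        if dist[node] == h:  # do not expand beyond h hops
            continue
        for nb in adj[node]:
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    nodes = sorted(dist)
    # Keep each undirected edge once, as (smaller id, larger id).
    edges = [(a, b) for a in nodes for b in sorted(adj[a]) if b in dist and a < b]
    return nodes, edges


def node_information_matrix(nodes, embeddings, attributes):
    """One row per subgraph node: latent features followed by explicit ones."""
    return [list(embeddings[n]) + list(attributes[n]) for n in nodes]


# Toy undirected graph (hypothetical data).
adj = {
    0: {1, 2},
    1: {0, 2, 3},
    2: {0, 1, 3},
    3: {1, 2, 4},
    4: {3, 5},
    5: {4},
}
# Made-up 2-dimensional embeddings; node degree stands in for an attribute.
embeddings = {n: [float(n), 0.5] for n in adj}
attributes = {n: [len(adj[n])] for n in adj}

score = common_neighbors(adj, 0, 3)
nodes, edges = enclosing_subgraph(adj, 0, 3, h=1)
X = node_information_matrix(nodes, embeddings, attributes)
print(score, nodes, edges, X[0])
```

In this sketch the pair `(nodes, edges)` together with the matrix `X` is what a graph-level classifier would receive for the candidate link between nodes 0 and 3. Node 5 lies two hops from both endpoints, so it is excluded when `h=1`.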