Clustering is a fundamental problem in network analysis: it identifies groups of closely connected nodes and separates them from the rest of the graph. Link prediction asks whether two nodes in a network are likely to be linked. By definition, then, cluster structure should help make link prediction more accurate. Yet prior work has either ignored this relationship or exploited it in ways that weaken it. In this article, we construct a simple but efficient clustering-driven link prediction framework (ClusterLP) that directly exploits cluster structure to predict connections between nodes as accurately as possible in both undirected and directed graphs. Specifically, we propose that in undirected graphs links form more easily between nodes with similar representation vectors and similar cluster tendencies, while in directed graphs a node more readily points to nodes whose representation vectors are similar to its own and that have greater influence within their own clusters. We implement ClusterLP separately for undirected and directed graphs, and experiments on link prediction over multiple real-world networks show that our models are highly competitive with existing baseline models. The code for ClusterLP and the baselines we use is available at https://github.com/ZINUX1998/ClusterLP.
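The intuition stated in the abstract can be illustrated with a minimal sketch. This is not the ClusterLP implementation (see the repository for that); it only assumes that each node has an embedding vector, that cluster tendency is a soft assignment to a set of cluster centers, and that each node carries a scalar influence value. The function names, the softmax assignment, the exponential scoring form, and the reading that influence belongs to the target node are all assumptions made for illustration.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def cluster_membership(z: Vector, centers: Sequence[Vector]) -> List[float]:
    """Hypothetical soft cluster tendency: softmax over negative squared
    distances from the node embedding to each cluster center."""
    logits = [-squared_distance(z, c) for c in centers]
    peak = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(value - peak) for value in logits]
    total = sum(weights)
    return [w / total for w in weights]


def undirected_score(z_u: Vector, z_v: Vector, centers: Sequence[Vector]) -> float:
    """Symmetric score in (0, 1]: larger when both the embeddings and the
    cluster tendencies of u and v are close. The functional form is assumed."""
    embed_gap = squared_distance(z_u, z_v)
    cluster_gap = squared_distance(
        cluster_membership(z_u, centers), cluster_membership(z_v, centers)
    )
    return math.exp(-(embed_gap + cluster_gap))


def directed_score(
    z_u: Vector, z_v: Vector, influence_v: float, centers: Sequence[Vector]
) -> float:
    """Asymmetric score for the edge u -> v: the symmetric similarity is
    scaled by a sigmoid of the target's (hypothetical) influence value."""
    gate = 1.0 / (1.0 + math.exp(-influence_v))
    return undirected_score(z_u, z_v, centers) * gate


if __name__ == "__main__":
    centers = [(0.0, 0.0), (5.0, 5.0)]  # two illustrative cluster centers
    a, b, c = (0.1, 0.0), (0.0, 0.2), (5.0, 4.9)
    print("a-b (same cluster):", undirected_score(a, b, centers))
    print("a-c (different clusters):", undirected_score(a, c, centers))
    print("a->b, influential b:", directed_score(a, b, 2.0, centers))
    print("b->a, weak a:", directed_score(b, a, -2.0, centers))
```

In this sketch the undirected score is symmetric in its two nodes, whereas the directed score depends on which node is the target, which mirrors the distinction the abstract draws between the two graph types.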