As a fundamental problem in network analysis, structural node similarity has received much attention in academia and is used in a wide range of applications. Among the proposed structural node similarity measures, role similarity stands out because it satisfies several axiomatic properties, including automorphism confirmation. However, existing role similarity metrics cannot handle top-k queries on large real-world networks because of their high time and space costs. In this paper, we propose a new role similarity metric, \textsf{ForestSim}. We prove that \textsf{ForestSim} is an admissible role similarity metric and devise a corresponding top-k similarity search algorithm, \textsf{ForestSimSearch}, which processes a top-k query in $O(k)$ time once the precomputation is finished. Moreover, we speed up the precomputation with a fast approximate algorithm for the diagonal entries of the forest matrix, which reduces the time and space complexity of the precomputation to $O(\epsilon^{-2}m\log^5{n}\log{\frac{1}{\epsilon}})$ and $O(m\log^3{n})$, respectively. Finally, we conduct extensive experiments on 26 real-world networks. The results show that \textsf{ForestSim} works efficiently on million-scale networks and achieves performance comparable to state-of-the-art methods.
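To make the pipeline described above concrete, the following is a minimal sketch of a precompute-then-query scheme built on the diagonal of the forest matrix, taken here in its standard form $(I + L)^{-1}$ with $L$ the graph Laplacian. Several parts are assumptions rather than the paper's method: the diagonal is computed exactly by dense inversion (feasible only for small graphs, unlike the approximate algorithm in the abstract), the pairwise score is a placeholder ratio of diagonal entries since the abstract does not give the \textsf{ForestSim} formula, and the names `forest_diagonal`, `build_index`, and `top_k` are hypothetical.

```python
import numpy as np


def forest_diagonal(adj):
    """Exact diagonal of the forest matrix (I + L)^{-1}.

    Dense inversion is used only for illustration; it does not scale
    to the million-node networks mentioned in the abstract.
    """
    adj = np.asarray(adj, dtype=float)
    lap = np.diag(adj.sum(axis=1)) - adj          # graph Laplacian L = D - A
    omega = np.linalg.inv(np.eye(len(adj)) + lap)  # forest matrix
    return np.diag(omega).copy()


def build_index(diag):
    """Precomputation: sort nodes by diagonal entry and record each rank."""
    order = np.argsort(diag, kind="stable")
    rank = np.empty_like(order)
    rank[order] = np.arange(len(order))
    return order, rank


def top_k(query, k, diag, order, rank):
    """Return up to k (node, score) pairs most similar to `query`.

    ASSUMPTION: the score is min(d_u, d_v) / max(d_u, d_v), a placeholder
    that is monotone in sorted order, not the paper's definition.
    Two pointers walk outward from the query's rank, so the loop runs
    at most k times.
    """
    n = len(order)
    pos = int(rank[query])
    left, right = pos - 1, pos + 1
    result = []
    while len(result) < k and (left >= 0 or right < n):
        # Left candidates have smaller diagonals, right candidates larger.
        s_left = diag[order[left]] / diag[query] if left >= 0 else -1.0
        s_right = diag[query] / diag[order[right]] if right < n else -1.0
        if s_left >= s_right:
            result.append((int(order[left]), float(s_left)))
            left -= 1
        else:
            result.append((int(order[right]), float(s_right)))
            right += 1
    return result
```

On a 5-node path graph, for instance, the two endpoints are automorphic and receive the same diagonal entry, so each is the other's top-1 match under this placeholder score. The sorted index is what allows a query to finish after at most k pointer moves; all expensive work sits in the precomputation.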