Spectral clustering is a popular method for community detection in network graphs: starting from a matrix representation of the graph, the nodes are clustered on a low dimensional projection obtained from a truncated spectral decomposition of the matrix. Estimating correctly the number of communities and the dimension of the reduced latent space is critical for good performance of spectral clustering algorithms. Furthermore, many real-world graphs, such as enterprise computer networks studied in cyber-security applications, often display heterogeneous within-community degree distributions. Such heterogeneous degree distributions are usually not well captured by standard spectral clustering algorithms. In this article, a novel spectral clustering algorithm is proposed for community detection under the degree-corrected stochastic blockmodel. The proposed method is based on a transformation of the spectral embedding to spherical coordinates, and a novel modelling assumption in the transformed space. The method allows for simultaneous and automated selection of the number of communities and the latent dimension for spectral embeddings of graphs with uneven node degrees. Results show improved performance over competing methods in representing computer networks.
翻译:在网络图中,光谱聚集是一种常见的社区探测方法:从图的矩阵表示开始,节点集中在从矩阵的短径光谱分解中获得的低维投影上。正确估计社区的数量和潜伏空间的维度对于光谱集算法的良好运行至关重要。此外,许多真实世界的图象,如在网络安全应用中研究的企业计算机网络,往往显示社区内部的分布差异性。标准光谱集算法通常无法很好地捕捉到这种差异度分布。在本篇文章中,提议了一种新的光谱集成算法,用于在度校正的区块模型下进行社区探测。拟议方法的基础是光谱嵌入到球形坐标的转换,以及改变后的空间的新建模假设。该方法允许同步和自动地选择社区数量以及具有不均衡节点度的光谱嵌入图的潜在维度。结果显示,在代表计算机网络的相竞方法方面,其性得到改进。