Higher-order structures of networks, namely, small subgraphs of networks (also called network motifs), are widely known to be crucial and essential to the organization of networks. There has been a few work studying the community detection problem -- a fundamental problem in network analysis, at the level of motifs. In particular, higher-order spectral clustering has been developed, where the notion of motif adjacency matrix is introduced as the input of the algorithm. However, it remains largely unknown that how higher-order spectral clustering works and when it performs better than its edge-based counterpart. To elucidate these problems, we investigate higher-order spectral clustering from a statistical perspective. In particular, we theoretically study the clustering performance of higher-order spectral clustering under a weighted stochastic block model and compare the resulting bounds with the corresponding results of edge-based spectral clustering. It turns out that when the network is dense with weak signal of weights, higher-order spectral clustering can really lead to the performance gain in clustering. We also use simulations and real data experiments to support the findings.
翻译:网络的高阶结构,即网络中的小子图(也称为网络模体),被广泛认为对网络的组织至关重要。已经有一些工作研究了社区检测问题,在模体的层面上。特别是,高阶谱聚类已经被开发出来,其中引入了模体邻接矩阵作为算法的输入。然而,目前仍然不清楚高阶谱聚类的工作原理以及何时表现得比其基于边的对应物更好。为了阐明这些问题,我们从统计角度研究了高阶谱聚类。特别地,我们在加权随机块模型下理论研究了高阶谱聚类的聚类性能,并将结果与基于边的谱聚类的相应结果进行了比较。结果表明,当网络密度较大且权重信号较弱时,高阶谱聚类确实可以提高聚类性能。我们还使用模拟和真实数据实验证明了这些发现。