The stochastic block model is one of the most studied network models for community detection. It is well-known that most algorithms proposed for fitting the stochastic block model likelihood function cannot scale to large-scale networks. One prominent work that overcomes this computational challenge is Amini et al.(2013), which proposed a fast pseudo-likelihood approach for fitting stochastic block models to large sparse networks. However, this approach does not have convergence guarantee, and is not well suited for small- or medium- scale networks. In this article, we propose a novel likelihood based approach that decouples row and column labels in the likelihood function, which enables a fast alternating maximization; the new method is computationally efficient, performs well for both small and large scale networks, and has provable convergence guarantee. We show that our method provides strongly consistent estimates of the communities in a stochastic block model. As demonstrated in simulation studies, the proposed method outperforms the pseudo-likelihood approach in terms of both estimation accuracy and computation efficiency, especially for large sparse networks. We further consider extensions of our proposed method to handle networks with degree heterogeneity and bipartite properties.
翻译:随机区块模型是研究最多的社区检测网络模型之一,众所周知,为安装随机区块模型概率功能而提出的大多数算法无法向大型网络推广。克服这一计算挑战的一项突出工作是Amini等人(2013年),其中提出了将随机区块模型与大型稀疏网络相匹配的快速假象方法。然而,这种方法没有趋同保证,也不适合小型或中型网络。在本条中,我们提出了一种基于可能性的新的可能性方法,在概率函数中分离行和列标签,使快速交替最大化成为可能;新方法具有计算效率,对小型和大型网络都运作良好,并具有可变的趋同保证。我们表明,我们的方法提供了一种非常一致的对大型随机区块模型中的社区的估计。如模拟研究所示,拟议方法在估计精度和计算效率方面,特别是在大型稀疏网络中,都超越了伪似方法。我们进一步考虑扩大我们拟议的处理网络的方法,使其具有双方特性和双方特性。