The stochastic block model (SBM) is a fundamental model for studying graph clustering, or community detection, in networks. It has received great attention in the last decade, and the balanced case, i.e., the case in which all clusters are large, has been well studied. However, our understanding of the SBM with unbalanced communities (arguably more relevant in practice) is still very limited. In this paper, we provide a simple SVD-based algorithm for recovering the communities in the SBM with communities of varying sizes. Under the KS-threshold conjecture, the tradeoff between the parameters in our algorithm is nearly optimal up to polylogarithmic factors for a wide range of regimes. As a byproduct, we obtain a time-efficient algorithm with improved query complexity for a clustering problem with a faulty oracle, which improves upon a number of previous works (Mazumdar and Saha [NIPS 2017], Larsen, Mitzenmacher and Tsourakakis [WWW 2020], Peng and Zhang [COLT 2021]). Under the KS-threshold conjecture, the query complexity of our algorithm is nearly optimal up to polylogarithmic factors.
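To make the setting concrete, the following is a minimal sketch of a generic spectral approach on an SBM with unbalanced communities. It is not the algorithm of the paper, which is not specified in this abstract. It only illustrates the general idea of projecting the adjacency matrix onto its top singular directions and grouping vertices whose projected rows are close. The function names, the community sizes, the edge probabilities `p` and `q`, and the distance threshold `tau` are all assumptions chosen for illustration, and `tau` is hand-picked for these particular parameters.

```python
import numpy as np


def sample_sbm(sizes, p, q, rng):
    """Sample a symmetric SBM adjacency matrix (no self-loops).

    Vertices in the same community are joined with probability p,
    vertices in different communities with probability q.
    """
    labels = np.repeat(np.arange(len(sizes)), sizes)
    n = labels.size
    same = labels[:, None] == labels[None, :]
    probs = np.where(same, p, q)
    # Draw each unordered pair once (strict upper triangle), then mirror it.
    upper = np.triu(rng.random((n, n)) < probs, k=1)
    adj = (upper | upper.T).astype(float)
    return adj, labels


def rank_k_projection(adj, k):
    """Return the best rank-k approximation of adj via a full SVD."""
    u, s, vt = np.linalg.svd(adj)
    return (u[:, :k] * s[:k]) @ vt[:k, :]


def greedy_cluster(rows, tau):
    """Group vertices whose projected rows lie within distance tau
    of an arbitrary, not yet assigned, center vertex."""
    n = rows.shape[0]
    labels = np.full(n, -1)
    next_label = 0
    for center in range(n):
        if labels[center] != -1:
            continue
        dist = np.linalg.norm(rows - rows[center], axis=1)
        members = (dist <= tau) & (labels == -1)
        labels[members] = next_label
        next_label += 1
    return labels


def same_partition(a, b):
    """Check whether two labelings agree up to a renaming of labels."""
    pairs = set(zip(a.tolist(), b.tolist()))
    return len(pairs) == len(set(a.tolist())) == len(set(b.tolist()))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Illustrative, hypothetical parameters: three communities of unequal size.
    adj, truth = sample_sbm([300, 150, 50], p=0.9, q=0.05, rng=rng)
    found = greedy_cluster(rank_k_projection(adj, k=3), tau=6.0)
    print("clusters found:", found.max() + 1)
    print("matches planted partition:", same_partition(found, truth))
```

In this toy regime the gap between `p` and `q` is large, so even the smallest community is well separated after projection. The interesting cases discussed in the abstract are those where small communities sit close to the noise level, and this simple sketch makes no claim about them.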