Community detection in graphs that are generated according to stochastic block models (SBMs) has received much attention lately. In this paper, we focus on the binary symmetric SBM -- in which a graph of $n$ vertices is randomly generated by first partitioning the vertices into two equal-sized communities and then connecting each pair of vertices with probability that depends on their community memberships -- and study the associated exact community recovery problem. Although the maximum-likelihood formulation of the problem is non-convex and discrete, we propose to tackle it using a popular iterative method called projected power iterations. To ensure fast convergence of the method, we initialize it using a point that is generated by another iterative method called orthogonal iterations, which is a classic method for computing invariant subspaces of a symmetric matrix. We show that in the logarithmic sparsity regime of the problem, with high probability the proposed two-stage method can exactly recover the two communities down to the information-theoretic limit in $\mathcal{O}(n\log^2n/\log\log n)$ time, which is competitive with a host of existing state-of-the-art methods that have the same recovery performance. We also conduct numerical experiments on both synthetic and real data sets to demonstrate the efficacy of our proposed method and complement our theoretical development.
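The two-stage procedure described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and several details are assumptions made for illustration: the function names (`sample_binary_sbm`, `project_balanced`, `orthogonal_iterations`, `two_stage_recovery`), the parametrization of the connection probabilities as $a\log n/n$ and $b\log n/n$, the fixed iteration counts, the Rayleigh-Ritz step used to extract the second eigenvector from the computed subspace, and the update rule that projects $Ax$ onto the balanced $\pm 1$ vectors. The sketch also stores the adjacency matrix densely, so it does not attain the running time stated in the abstract.

```python
import numpy as np


def sample_binary_sbm(n, a, b, seed=0):
    """Balanced labels in {+1,-1}^n (n even) and a symmetric adjacency matrix.

    Assumed parametrization: edge probability a*log(n)/n inside a community
    and b*log(n)/n across communities.
    """
    rng = np.random.default_rng(seed)
    labels = rng.permutation(np.repeat([1.0, -1.0], n // 2))
    p, q = a * np.log(n) / n, b * np.log(n) / n
    probs = np.where(np.outer(labels, labels) > 0, p, q)
    upper = np.triu(rng.random((n, n)) < probs, k=1)  # k=1: no self-loops
    A = (upper | upper.T).astype(float)
    return labels, A


def project_balanced(v):
    """Euclidean projection of v onto {x in {+1,-1}^n : sum(x) = 0}.

    The n/2 largest entries of v are mapped to +1 and the rest to -1.
    """
    n = v.shape[0]
    x = -np.ones(n)
    x[np.argsort(-v, kind="stable")[: n // 2]] = 1.0
    return x


def orthogonal_iterations(A, k=2, num_iters=30, seed=0):
    """Approximate an orthonormal basis of the dominant k-dimensional
    invariant subspace of the symmetric matrix A."""
    rng = np.random.default_rng(seed)
    Q, _ = np.linalg.qr(rng.standard_normal((A.shape[0], k)))
    for _ in range(num_iters):
        Q, _ = np.linalg.qr(A @ Q)
    return Q


def two_stage_recovery(A, num_oi_iters=30, max_ppi_iters=50):
    """Stage 1: spectral initialization. Stage 2: projected power iterations."""
    Q = orthogonal_iterations(A, k=2, num_iters=num_oi_iters)
    # Rayleigh-Ritz step (an assumption of this sketch): eigh returns
    # eigenvalues in ascending order, so column 0 corresponds to the
    # second-largest eigenvalue of A restricted to span(Q).
    _, V = np.linalg.eigh(Q.T @ A @ Q)
    x = project_balanced(Q @ V[:, 0])
    for _ in range(max_ppi_iters):
        x_next = project_balanced(A @ x)
        if np.array_equal(x_next, x):  # reached a fixed point
            break
        x = x_next
    return x


if __name__ == "__main__":
    truth, A = sample_binary_sbm(n=400, a=30.0, b=2.0, seed=0)
    estimate = two_stage_recovery(A)
    # Communities are identifiable only up to a global sign flip.
    print("exact recovery:", abs(estimate @ truth) == truth.shape[0])
```

The parameters in the usage example are chosen well above the exact-recovery threshold so that the sketch behaves predictably on a small graph; they are illustrative and are not taken from the paper's experiments.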