Community detection is one of the most fundamental problems in network analysis. The stochastic block model (SBM) is a widely used model for which various estimation methods have been developed and their community detection consistency established. However, the SBM rests on the strong assumption that all nodes in the same community are stochastically equivalent, which may not hold in practical applications. We introduce the pairwise covariates-adjusted stochastic block model (PCABM), a generalization of the SBM that incorporates pairwise covariate information. We study the maximum likelihood estimates of the covariate coefficients and of the community assignments, and show that both are consistent under suitable sparsity conditions. We introduce spectral clustering with adjustment (SCWA) to fit PCABM efficiently. Under certain conditions, we derive an error bound for community detection with SCWA and show that it is community detection consistent. In addition, we investigate model selection for the number of communities and feature selection for the pairwise covariates, and propose two corresponding algorithms. PCABM compares favorably with the SBM and the degree-corrected stochastic block model (DCBM) across a wide range of simulated and real networks when covariate information is accessible.
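To make the idea of covariate adjustment concrete, the following is a minimal sketch and not the paper's implementation. The abstract does not give the model equations, so the sketch assumes Poisson edge counts with rate `B[c_i, c_j] * exp(gamma * z_ij)` for a single scalar pairwise covariate `z_ij`, and assumes the adjustment divides each edge count by `exp(gamma_hat * z_ij)` before a two-community spectral split. The function names `simulate_pcabm` and `spectral_clustering_with_adjustment` are hypothetical, and the true coefficient stands in for a maximum likelihood estimate.

```python
import numpy as np


def simulate_pcabm(n, B, gamma, rng):
    """Draw a network from an ASSUMED PCABM-style model (not taken from the abstract):
    A_ij ~ Poisson(B[c_i, c_j] * exp(gamma * z_ij)) with one scalar covariate z_ij."""
    K = B.shape[0]
    labels = rng.integers(0, K, size=n)      # community assignment c_i
    Z = np.triu(rng.normal(size=(n, n)), 1)  # pairwise covariate, upper triangle
    Z = Z + Z.T                              # symmetrize, zero diagonal
    rate = B[labels][:, labels] * np.exp(gamma * Z)
    A = np.triu(rng.poisson(rate), 1)        # keep each pair once, no self-loops
    A = A + A.T
    return A, Z, labels


def spectral_clustering_with_adjustment(A, Z, gamma_hat):
    """Two-community sketch: remove the ASSUMED multiplicative covariate effect,
    then split nodes by the sign of the second leading eigenvector."""
    A_adj = A / np.exp(gamma_hat * Z)        # adjusted adjacency matrix
    vals, vecs = np.linalg.eigh(A_adj)
    order = np.argsort(-np.abs(vals))        # sort by eigenvalue magnitude
    second = vecs[:, order[1]]
    return (second > 0).astype(int)


rng = np.random.default_rng(0)
B = np.array([[0.4, 0.05], [0.05, 0.4]])     # illustrative block rates
A, Z, labels = simulate_pcabm(300, B, gamma=0.5, rng=rng)
# The true gamma stands in for an estimate here; this is a simplification.
pred = spectral_clustering_with_adjustment(A, Z, gamma_hat=0.5)
# Community labels are only identified up to a swap.
accuracy = max(np.mean(pred == labels), np.mean(pred != labels))
print(f"fraction of nodes correctly clustered: {accuracy:.3f}")
```

The sign split is used only because the example has two communities; with more communities, one would typically run k-means on the leading eigenvectors instead.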