The community detection problem requires to cluster the nodes of a network into a small number of well-connected "communities". There has been substantial recent progress in characterizing the fundamental statistical limits of community detection under simple stochastic block models. However, in real-world applications, the network structure is typically dynamic, with nodes that join over time. In this setting, we would like a detection algorithm to perform only a limited number of updates at each node arrival. While standard voting approaches satisfy this constraint, it is unclear whether they exploit the network information optimally. We introduce a simple model for networks growing over time which we refer to as streaming stochastic block model (StSBM). Within this model, we prove that voting algorithms have fundamental limitations. We also develop a streaming belief-propagation (StreamBP) approach, for which we prove optimality in certain regimes. We validate our theoretical findings on synthetic and real data.
翻译:社区检测问题要求将网络的节点分组成少数连接良好的“社区” 。 最近,在简单随机区块模型下社区检测的基本统计限制特征的定性方面取得了显著进展。 但是,在现实世界应用中,网络结构通常是动态的,结点会随着时间推移而相联。 在这种环境下,我们希望检测算法只在每个节点到达时只进行数量有限的更新。虽然标准投票方法满足了这一限制,但不清楚它们是否最佳地利用网络信息。我们为不断增长的网络引入了一个简单的模型,我们称之为流动随机区块模型(StSBM )。在这个模型中,我们证明投票算法具有根本性的局限性。我们还开发了一种流动的信仰-调整(StreamBP)方法,我们证明在某些制度中是最佳的。我们验证了关于合成和真实数据的理论结论。