This chapter provides a self-contained introduction to the use of Bayesian inference to extract large-scale modular structures from network data, based on the stochastic blockmodel (SBM) and its degree-corrected and overlapping generalizations. We focus on nonparametric formulations, which allow inference in a manner that prevents overfitting and enables model selection. We discuss the choice of priors, in particular how to avoid underfitting by using deeper Bayesian hierarchies, and we contrast the task of sampling network partitions from the posterior distribution with that of finding the single point estimate that maximizes it, describing efficient algorithms for both. We also show how inferring the SBM can be used to predict missing and spurious links, and we shed light on the fundamental limitations of the detectability of modular structures in networks.