We perform a systematic analysis of the quality of fit of the stochastic block model (SBM) for 275 empirical networks spanning a wide range of domains and orders of magnitude in size. We employ posterior predictive model checking as the criterion for quality of fit: networks generated by the inferred model are compared with the empirical network according to a set of network descriptors. We observe that the SBM provides an accurate description of the majority of the networks considered, but does not meet all modeling requirements. In particular, networks with a large diameter and slow-mixing random walks tend to be badly described by the SBM. However, contrary to what is often assumed, networks with a high abundance of triangles can be well described by the SBM in many cases. We demonstrate that simple network descriptors can be used to evaluate whether or not the SBM can provide a sufficiently accurate representation, pointing to possible model extensions that can systematically improve the expressiveness of this class of models.
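As an illustration of the posterior predictive check described above, the following is a minimal sketch rather than the procedure used in this work. It makes several simplifying assumptions that are not taken from the text: the node partition `blocks` is treated as known instead of being inferred, the model is a plain Bernoulli SBM with an independent uniform Beta(1, 1) prior on each block-pair edge probability, and the triangle count is the only descriptor. All function names are hypothetical.

```python
import random
from itertools import combinations


def build_adj(n, edge_list):
    """Build an undirected simple graph as a list of neighbor sets."""
    adj = [set() for _ in range(n)]
    for u, v in edge_list:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def count_triangles(adj):
    """Descriptor: number of triangles, each counted once as u < v < w."""
    total = 0
    for u in range(len(adj)):
        for v in adj[u]:
            if v > u:
                total += sum(1 for w in adj[u] & adj[v] if w > v)
    return total


def block_counts(adj, blocks, num_blocks):
    """Count edges and node pairs for every unordered block pair (r <= s)."""
    keys = [(r, s) for r in range(num_blocks) for s in range(r, num_blocks)]
    edges = {key: 0 for key in keys}
    pairs = {key: 0 for key in keys}
    for u, v in combinations(range(len(adj)), 2):
        key = (min(blocks[u], blocks[v]), max(blocks[u], blocks[v]))
        pairs[key] += 1
        if v in adj[u]:
            edges[key] += 1
    return edges, pairs


def sample_predictive_graph(blocks, edges, pairs, rng):
    """Draw one graph from the posterior predictive distribution.

    Assumes a Beta(1, 1) prior, so each block-pair probability has a
    Beta(1 + edges, 1 + non-edges) posterior given the fixed partition.
    """
    probs = {
        key: rng.betavariate(1 + edges[key], 1 + pairs[key] - edges[key])
        for key in edges
    }
    adj = [set() for _ in range(len(blocks))]
    for u, v in combinations(range(len(blocks)), 2):
        key = (min(blocks[u], blocks[v]), max(blocks[u], blocks[v]))
        if rng.random() < probs[key]:
            adj[u].add(v)
            adj[v].add(u)
    return adj


def predictive_check(adj, blocks, num_blocks, descriptor, samples=200, seed=0):
    """Compare the observed descriptor with its posterior predictive samples."""
    rng = random.Random(seed)
    edges, pairs = block_counts(adj, blocks, num_blocks)
    observed = descriptor(adj)
    simulated = [
        descriptor(sample_predictive_graph(blocks, edges, pairs, rng))
        for _ in range(samples)
    ]
    mean = sum(simulated) / samples
    # Fraction of generated networks with a descriptor at least as large
    # as the observed one (an upper-tail predictive probability).
    tail = sum(1 for x in simulated if x >= observed) / samples
    return observed, mean, tail


# Toy network: two 4-node cliques joined by a single edge.
edge_list = list(combinations(range(4), 2)) + list(combinations(range(4, 8), 2))
adj = build_adj(8, edge_list + [(3, 4)])
blocks = [0] * 4 + [1] * 4
observed, mean, tail = predictive_check(adj, blocks, 2, count_triangles, seed=1)
print(observed, round(mean, 2), tail)
```

A tail probability close to 0 or 1 would suggest that the model, under these assumptions, fails to reproduce the descriptor. A fuller check would also sample the partition from its posterior and repeat the comparison for several descriptors, such as diameter or random-walk mixing time.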