Graph clustering is the task of partitioning a collection of observed networks into groups of similar networks. Here similarity means networks have a similar structure or graph topology. To this end, a model-based approach is developed, where the networks are modelled by a finite mixture model of stochastic block models. Moreover, a computationally efficient clustering algorithm is developed. The procedure is an agglomerative hierarchical algorithm that maximizes the so-called integrated classification likelihood criterion. The bottom-up algorithm consists of successive merges of clusters of networks. Those merges require a means to match block labels of two stochastic block models to overcome the label-switching problem. This problem is addressed with a new distance measure for the comparison of stochastic block models based on their graphons. The algorithm provides a cluster hierarchy in form of a dendrogram and valuable estimates of all model parameters.
翻译:图形群集是将观测到的网络汇集成类似网络组的任务。 这里相似意味着网络具有类似的结构或图示表层。 为此, 开发了一种基于模型的方法, 网络以随机区块模型的有限混合模型为模型。 此外, 开发了一种计算高效的群集算法。 程序是一种集合式的等级算法, 使所谓的综合分类概率标准最大化。 自下而上的算法由网络群集的相继合并组成。 这些合并需要一种方法来匹配两个随机区块模型的区块标签, 以克服标签切换问题。 这个问题将用新的距离测量标准来解决, 以比较基于其图形的随机区块模型。 算法提供了一种群集等级, 其形式为时速法和所有模型参数的有价值的估计。