Community detection is a classic problem in network science with extensive applications in various fields. The most commonly used methods are the algorithms designed to maximize a utility function, modularity, across different ways that a network can be partitioned into communities. Despite their name and design philosophy, current modularity maximization algorithms generally fail to maximize modularity or guarantee any proximity to an optimal solution. We propose the Bayan algorithm which, unlike the existing methods, returns network partitions with a guarantee of either optimality or proximity to an optimal solution. At the core of the Bayan algorithm is a branch-and-cut scheme that solves a sparse integer programming formulation of the modularity maximization problem to optimality or approximate it within a factor. We analyze the performance of Bayan against 22 existing algorithms using synthetic and real networks. Through extensive experiments, we demonstrate Bayan's distinctive capabilities not only in maximizing modularity, but more importantly in accurate retrieval of ground-truth communities. Bayan's comparative level of performance remains stable over variations in the amount of noise in the data (graph) generation process. The performance of Bayan as an exact modularity maximization algorithm also reveals the theoretical capability limits of maximum-modularity partitions in accurate retrieval of communities. Overall our analysis points to Bayan as a suitable choice for a methodologically grounded detection of communities through exact (approximate) maximization of modularity in networks with up to $\sim10^3$ edges (and larger networks). Prospective advances in graph optimization and integer programming can push these limits further.
翻译:社区探测是网络科学中一个典型的问题,在各个领域都有广泛的应用。最常用的方法是旨在最大限度地扩大一个公用事业功能的算法,即模块化,以各种方式将网络分割成社区。尽管其名称和设计哲学,但目前的模块化最大化算法通常无法最大限度地扩大模块化,或保证任何接近最佳解决办法。我们建议采用巴扬算法,与现有方法不同,该算法返回网络分区,保证其最佳性或接近于最佳解决办法。在巴扬算法的核心是分流和切分法,旨在解决模块化最大化问题的零散整数编程配方,使之达到最佳性或在一个因素范围内接近。我们用合成和真实网络对照22个现有算法分析巴扬的绩效。我们通过广泛的实验,不仅在最大模块化方面,而且更重要的是在准确检索地面圈方面表现出独特的能力。巴扬的相对性业绩水平仍然稳定,因为数据(绘图)生成过程中的噪声量变化幅度更大。巴扬作为精确的模块化最小性推进度 3 和最精确性精确性网络的精确性精确性计算能力极限,通过最精确的精确的深度分析,可以显示这些精确的精确性分析社区。