Community detection is a classic problem in network science with extensive applications in various fields. Among numerous approaches, the most common method is modularity maximization. Despite their design philosophy and wide adoption, heuristic modularity maximization algorithms rarely return an optimal partition or anything similar. We propose a specialized algorithm, Bayan, which returns partitions with a guarantee of either optimality or proximity to an optimal partition. At the core of the Bayan algorithm is a branch-and-cut scheme that solves an integer programming formulation of the problem to optimality or approximate it within a factor. We demonstrate Bayan's distinctive accuracy and stability over 21 other algorithms in retrieving ground-truth communities in synthetic benchmarks and node labels in real networks. Bayan is several times faster than open-source and commercial solvers for modularity maximization making it capable of finding optimal partitions for instances that cannot be optimized by any other existing method. Overall, our assessments point to Bayan as a suitable choice for exact maximization of modularity in networks with up to 3000 edges (in their largest connected component) and approximating maximum modularity in larger networks on ordinary computers.
翻译:社区检测是网络科学中的经典问题,在各个领域具有广泛的应用。在众多方法中,最常见的方法是模块化最大化。尽管启发式模块化最大化算法被广泛采用,但它们很少返回最优分区或类似的分区。我们提出了一个专门的算法,拜扬算法,它返回有最优保证或接近最优分区的分区。拜扬算法的核心是分支定界方案,该方案通过解决问题的整数规划形式来实现最优解或近似解。我们展示了Bayan在合成基准测试和实际网络中检索地面真相社区和节点标签时的准确性和稳定性优势。Bayan比开源和商业求解器快几倍,使其能够找到不能通过任何其他现有方法进行优化的实例的最优分区。总体而言,我们的评估指出,Bayan是在普通计算机上精确最大化模块化的合适选择,对于具有最大3000条边(在其最大连接组件中)的网络以及近似最大模块化的较大网络。