The Louvain algorithm is currently one of the most popular community detection methods. This algorithm finds communities by maximizing a quantity called modularity. In this work, we describe a metric space of clusterings, where clusterings are described by a binary vector indexed by the vertex-pairs. We extend this geometry to a hypersphere and prove that maximizing modularity is equivalent to minimizing the angular distance to some modularity vector over the set of clustering vectors. This equivalence allows us to view the Louvain algorithm as a nearest-neighbor search that approximately minimizes the distance to this modularity vector. By replacing this modularity vector by a different vector, many alternative community detection methods can be obtained. We explore this wider class and compare it to existing modularity-based methods. Our experiments show that these alternatives may outperform modularity-based methods. For example, when communities are large compared to vertex neighborhoods, a vector based on numbers of common neighbors outperforms existing community detection methods. While the focus of the present work is community detection in networks, the proposed methodology can be applied to any clustering problem where pair-wise similarity data is available.
翻译:Louvain 算法是目前最受欢迎的社区检测方法之一。 此算法通过最大量的模块化发现社区。 在这项工作中, 我们描述一个群集的衡量空间, 群集由按顶层指数指数的二进制矢量描述。 我们将这一几何方法扩展至超视距, 并证明最大度的模块化程度相当于将角距离降至组群矢量中某些模块化矢量的最小值。 这个等值允许我们将Louvain 算法视为近邻搜索, 以尽可能缩小与模块化矢量的距离。 通过以不同的矢量取代这个模块化矢量, 可以获得许多替代性社区检测方法。 我们探索这个更大的分类, 并将其与现有的模块化方法进行比较。 我们的实验显示, 这些替代方法可能超越基于模块化的方法。 例如, 当群落与顶端群群群群相比大时, 一个基于普通邻居数目的矢量的矢量比现有社区检测方法要好。 虽然当前工作的重点是网络中的社区检测, 提议的方法可以适用于任何组合问题, 。