Enhancing Efficiency in Parallel Louvain Algorithm for Community Detection

Community detection is a key aspect of network analysis, as it allows groups and patterns within a network to be identified. As networks continue to grow, fast algorithms are needed to analyze them efficiently. The Louvain algorithm is a modularity-based greedy algorithm that partitions a network into disjoint communities, refining the partition over several iterations. It is known for finding high-quality communities even in large, dense networks. However, it can be at least a factor of ten slower than community detection techniques based on label propagation, which are generally extremely fast but produce communities of lower quality. Researchers have proposed a number of methods for parallelizing and improving the Louvain algorithm. To decide which strategy is generally the best fit and which parameter values give the highest performance without compromising community quality, it is critical to assess the performance and accuracy of these existing approaches. In this report, we implement single-threaded and multi-threaded versions of the static Louvain algorithm, carefully examine the method's details, make the required tweaks and optimizations, and determine suitable parameter values. The tolerance applied in each pass can be changed to adjust the method's performance. An asynchronous version of the algorithm with an initial tolerance of 0.01 and a tolerance decline factor of 10 produced the best results. Overall, our findings suggest that the approach is not well suited to shared-memory parallelism; one potential workaround is to break the graph into manageable chunks that can be processed independently and then merged back together.
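To make the role of the tolerance concrete, the following is a minimal single-threaded sketch of a Louvain-style procedure in Python. It is not the report's implementation, and it involves no parallelism. The function names (`build_graph`, `local_move`, `aggregate`, `louvain`) and the parameters `max_iters` and `max_passes` are illustrative assumptions. Only the initial tolerance of 0.01 and the decline factor of 10 come from the abstract. Moves take effect immediately within an iteration, which is one sequential reading of "asynchronous".

```python
from collections import defaultdict

def build_graph(edges):
    """Symmetric adjacency map {u: {v: weight}} from unweighted, undirected edges."""
    adj = defaultdict(dict)
    for u, v in edges:
        adj[u][v] = 1.0
        adj[v][u] = 1.0
    return dict(adj)

def total_weight(adj):
    """Sum of all adjacency entries, i.e. 2m for an undirected graph."""
    return sum(w for nbrs in adj.values() for w in nbrs.values())

def modularity(adj, comm):
    """Modularity Q of the partition `comm` (node -> community id)."""
    two_m = total_weight(adj)
    inside, total = defaultdict(float), defaultdict(float)
    for u, nbrs in adj.items():
        for v, w in nbrs.items():
            total[comm[u]] += w
            if comm[u] == comm[v]:
                inside[comm[u]] += w
    return sum(inside[c] / two_m - (total[c] / two_m) ** 2 for c in total)

def local_move(adj, tolerance, max_iters=20):
    """Local-moving phase: greedily move nodes until the gain per iteration <= tolerance."""
    two_m = total_weight(adj)
    deg = {u: sum(nbrs.values()) for u, nbrs in adj.items()}
    comm = {u: u for u in adj}   # every node starts in its own community
    tot = dict(deg)              # total degree of each community
    for _ in range(max_iters):
        gain_sum = 0.0
        for u in adj:
            old = comm[u]
            tot[old] -= deg[u]   # take u out of its community
            link = defaultdict(float)
            link[old] = 0.0      # staying put is always a candidate
            for v, w in adj[u].items():
                if v != u:       # self-loops do not link u to other nodes
                    link[comm[v]] += w

            def gain(c):
                # Modularity gain of inserting the isolated node u into community c.
                return 2.0 * link[c] / two_m - 2.0 * tot[c] * deg[u] / two_m ** 2

            best = max(link, key=gain)
            delta = gain(best) - gain(old)
            if delta > 0:
                gain_sum += delta
            else:
                best = old
            comm[u] = best       # applied immediately, so later nodes see the move
            tot[best] += deg[u]
        if gain_sum <= tolerance:
            break
    return comm

def aggregate(adj, comm):
    """Collapse each community into a super-node; internal edges become self-loops."""
    agg = defaultdict(lambda: defaultdict(float))
    for u, nbrs in adj.items():
        for v, w in nbrs.items():
            agg[comm[u]][comm[v]] += w
    return {c: dict(nbrs) for c, nbrs in agg.items()}

def louvain(adj, tolerance=0.01, tolerance_decline=10.0, max_passes=10):
    """Repeat local-moving and aggregation, tightening the tolerance after each pass."""
    membership = {u: u for u in adj}
    for _ in range(max_passes):
        comm = local_move(adj, tolerance)
        if len(set(comm.values())) == len(adj):
            break                # no communities merged in this pass
        membership = {u: comm[c] for u, c in membership.items()}
        adj = aggregate(adj, comm)
        tolerance /= tolerance_decline
    return membership

# Two triangles joined by a single edge.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
graph = build_graph(edges)
membership = louvain(graph)
print(membership, modularity(graph, membership))
```

On this small graph the sketch places each triangle in its own community. A looser initial tolerance ends the early passes sooner, when the graph is largest and iterations are most expensive, and dividing it by the decline factor makes later passes on the smaller aggregated graph stricter.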
