Structural clustering is one of the most popular graph clustering methods, which has achieved great performance improvement by utilizing GPUs. Even though, the state-of-the-art GPU-based structural clustering algorithm, GPUSCAN, still suffers from efficiency issues since lots of extra costs are introduced for parallelization. Moreover, GPUSCAN assumes that the graph is resident in the GPU memory. However, the GPU memory capacity is limited currently while many real-world graphs are big and cannot fit in the GPU memory, which makes GPUSCAN unable to handle large graphs. Motivated by this, we present a new GPU-based structural clustering algorithm, GPUSCAN++, in this paper. To address the efficiency issue, we propose a new progressive clustering method tailored for GPUs that not only avoid high parallelization costs but also fully exploits the computing resources of GPUs. To address the GPU memory limitation issue, we propose a partition-based algorithm for structural clustering that can process large graphs with limited GPU memory. We conduct experiments on real graphs, and the experimental results demonstrate that our algorithm can achieve up to 168 times speedup compared with the state-of-the-art GPU-based algorithm when the graph can be resident in the GPU memory. Moreover, our algorithm is scalable to handle large graphs. As an example, our algorithm can finish the structural clustering on a graph with 1.8 billion edges using less than 2 GB GPU memory.
翻译:暂无翻译