Graph partition is a key component to achieve workload balance and reduce job completion time in parallel graph processing systems. Among the various partition strategies, edge partition has demonstrated more promising performance in power-law graphs than vertex partition and thereby has been more widely adopted as the default partition strategy by existing graph systems. The graph edge partition problem, which is to split the edge set into multiple balanced parts to minimize the total number of copied vertices, has been widely studied from the view of optimization and algorithms. In this paper, we study local search algorithms for this problem to further improve the partition results from existing methods. More specifically, we propose two novel concepts, namely adjustable edges and blocks. Based on these, we develop a greedy heuristic as well as an improved search algorithm utilizing the property of the max-flow model. To evaluate the performance of our algorithms, we first provide adequate theoretical analysis in terms of the approximation quality. We significantly improve the previously known approximation ratio for this problem. Then we conduct extensive experiments on a large number of benchmark datasets and state-of-the-art edge partition strategies. The results show that our proposed local search framework can further improve the quality of graph partition by a wide margin.
翻译:图形分割是实现工作量平衡和减少平行图形处理系统中工作完成时间的关键组成部分。 在各种分区战略中, 边缘分割在电源法图形中的表现比顶端分割法表现更加有希望, 并因此被现有图形系统广泛采用为默认分割战略。 图形边缘分割问题, 即将边缘分成多个平衡部分, 以最大限度地减少复制的脊椎总数, 从优化和算法的角度进行了广泛研究。 本文中, 我们研究本地搜索算法来解决这个问题, 以进一步改进现有方法的分区结果。 更具体地说, 我们提出了两个新概念, 即可调整的边缘和区块。 基于这两个战略, 我们开发了贪婪的偏差法, 并改进了使用最大流模型属性的搜索算法。 为了评估我们算法的性能, 我们首先从近距离质量的角度提供了充分的理论分析。 我们大大改进了以前知道的这一问题的近似比率。 然后, 我们对大量的基准数据集和状态边缘分割战略进行广泛的实验。 结果显示, 我们提议的本地搜索框架可以通过宽分区法进一步改进质量。