The dynamic scaling of distributed computations plays an important role in the utilization of elastic computational resources, such as the cloud. It enables the provisioning and de-provisioning of resources to match dynamic resource availability and demands. In the case of distributed graph processing, changing the number of the graph partitions while maintaining high partitioning quality imposes serious computational overheads as typically a time-consuming graph partitioning algorithm needs to execute each time repartitioning is required. In this paper, we propose a dynamic scaling method that can efficiently change the number of graph partitions while keeping its quality high. Our idea is based on two techniques: preprocessing and very fast edge partitioning, called graph edge ordering and chunk-based edge partitioning, respectively. The former converts the graph data into an ordered edge list in such a way that edges with high locality are closer to each other. The latter immediately divides the ordered edge list into an arbitrary number of high-quality partitions. The evaluation with the real-world billion-scale graphs demonstrates that our proposed approach significantly reduces the repartitioning time, while the partitioning quality it achieves is on par with that of the best existing static method.
翻译:分布式计算法的动态缩放在利用弹性计算资源(如云)方面起着重要作用。 它使得提供和减少资源以匹配动态资源可用性和需求。 在分布式图表处理中, 改变图形分割区的数量, 同时保持高分隔质量, 将要求严格的计算间接费用, 通常需要花费时间的图形分割算法来执行每次重新分割。 在本文件中, 我们提议一种动态缩放方法, 可以有效改变图形分割区的数量, 同时保持其质量。 我们的想法基于两种技术: 预处理和非常快速边缘分割, 被称为图形边缘排序和块块边缘分割。 前者将图形数据转换成有条边框的边缘列表, 其方式使高地理位置的边缘更加接近对方。 后者立即将订购的边缘列表分割为任意的高质量分割区数。 与真实世界亿级的图表相比, 评估表明我们提议的配置方法大大缩短了重新划分时间, 而它所实现的分隔质量与现有最佳静态方法相同。