This paper evaluates eight parallel graph processing systems: Hadoop, HaLoop, Vertica, Giraph, GraphLab (PowerGraph), Blogel, Flink Gelly, and GraphX (Spark), over four very large datasets (Twitter, World Road Network, UK 200705, and ClueWeb) using four workloads: PageRank, weakly connected components (WCC), single-source shortest paths (SSSP), and K-hop. The main objective is an independent scale-out study that experimentally analyzes the performance, usability, and scalability of these systems on up to 128 machines. In addition to performance results, we discuss our experiences in using these systems and suggest system tuning heuristics that lead to better performance.