There is a growing need for distributed graph processing systems that scale gracefully to very large graph datasets. This need has been difficult to meet because the process-centric, message-passing designs that many graph processing systems follow impose intense memory pressure. Pregelix is a new open-source distributed graph processing system based on an iterative dataflow design that is better tuned to handle both in-memory and out-of-core workloads. As a result, Pregelix offers better performance and scaling than current open-source systems (e.g., we have seen up to 15x speedup compared to Apache Giraph and up to 35x speedup compared to distributed GraphLab) and makes more effective use of available machine resources to support Big(ger) Graph Analytics.