Industry and academia have proposed many distributed graph processing systems. However, existing systems are not friendly enough for users such as data analysts and algorithm engineers. On the one hand, programming models and interfaces differ widely across existing systems, leading to high learning and program migration costs. On the other hand, these systems are tightly coupled to their underlying distributed computing platforms, requiring users to be familiar with distributed computing. To improve the usability of distributed graph processing, we propose UniGPS, a unified distributed graph programming framework. First, we propose VCProg, a unified cross-platform graph programming model for UniGPS. VCProg hides the details of distributed computing from users and is compatible with the popular graph programming models Pregel, GAS, and Push-Pull. VCProg programs can be executed by compatible distributed graph processing systems without modification, reducing users' learning overhead. Second, UniGPS supports Python as the programming language. We propose an execution environment isolation mechanism based on interprocess communication that enables Java/C++-based systems to call user-defined methods written in Python. Experimental results show that UniGPS enables users to process big graphs beyond the memory capacity of a single machine without sacrificing usability, and that it exhibits near-linear data scalability and machine scalability.
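To make the idea of a vertex-centric program that hides distributed execution more concrete, the following is a minimal sketch in Python. It is not the VCProg interface: the class `ShortestPathProgram`, its methods `init`, `emit`, `merge`, and `compute`, and the single-process driver `run` are all hypothetical names chosen for illustration. The sketch only shows the general pattern in which the user writes per-vertex logic and an engine decides how messages are delivered.

```python
from typing import Dict, Hashable, List, Tuple

INF = float("inf")
Edge = Tuple[Hashable, Hashable, float]  # (source, destination, weight)


class ShortestPathProgram:
    """Hypothetical vertex-centric program (NOT the actual VCProg API).

    The user supplies only per-vertex logic; nothing here refers to
    partitions, workers, or the network.
    """

    def __init__(self, source: Hashable) -> None:
        self.source = source

    def init(self, vertex_id: Hashable) -> float:
        # Initial vertex value: distance 0 at the source, infinity elsewhere.
        return 0.0 if vertex_id == self.source else INF

    def emit(self, value: float, edge_weight: float) -> float:
        # Message sent along an out-edge: candidate distance for the neighbor.
        return value + edge_weight

    def merge(self, a: float, b: float) -> float:
        # Combine two messages addressed to the same vertex.
        return min(a, b)

    def compute(self, value: float, message: float) -> float:
        # New vertex value given the merged incoming message.
        return min(value, message)


def run(program: ShortestPathProgram, edges: List[Edge],
        max_iters: int = 100) -> Dict[Hashable, float]:
    """Toy single-process driver standing in for a distributed engine."""
    vertices = {v for s, d, _ in edges for v in (s, d)}
    values = {v: program.init(v) for v in vertices}
    active = set(vertices)  # every vertex is active in the first iteration
    for _ in range(max_iters):
        if not active:
            break
        inbox: Dict[Hashable, float] = {}
        for s, d, w in edges:
            if s in active:
                msg = program.emit(values[s], w)
                inbox[d] = program.merge(inbox[d], msg) if d in inbox else msg
        # Only vertices whose value changed stay active for the next round.
        active = set()
        for v, msg in inbox.items():
            new_value = program.compute(values[v], msg)
            if new_value != values[v]:
                values[v] = new_value
                active.add(v)
    return values


if __name__ == "__main__":
    demo_edges = [("a", "b", 1.0), ("b", "c", 2.0), ("a", "c", 5.0), ("c", "d", 1.0)]
    print(run(ShortestPathProgram("a"), demo_edges))
```

In this sketch the same `ShortestPathProgram` object could in principle be handed to a different driver (for example one that pushes or pulls messages across machines) without changing the user code, which is the kind of portability the abstract describes.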