Recent studies have shown that single-machine graph processing systems can be highly competitive with cluster-based approaches on large-scale problems. Although several out-of-core graph processing systems and computation models have been proposed, high disk I/O overhead can significantly reduce their performance in many practical cases. In this paper, we propose GraphMP to tackle big graph analytics on a single machine. GraphMP achieves low disk I/O overhead with three techniques. First, we design a vertex-centric sliding window (VSW) computation model that avoids reading and writing vertices on disk. Second, we propose a selective scheduling method that skips loading and processing unnecessary edge shards on disk. Third, we use a compressed edge cache mechanism that fully utilizes the machine's available memory to reduce the number of disk accesses for edges. Extensive evaluations show that GraphMP can outperform existing single-machine out-of-core systems such as GraphChi, X-Stream and GridGraph by up to 51×, and can be highly competitive with distributed graph engines such as Pregel+, PowerGraph and Chaos.
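To make the three techniques concrete, the following is a minimal Python sketch, not GraphMP's actual implementation. It assumes edges are partitioned into shards by destination vertex interval, keeps all vertex values in an in-memory list, skips any shard with no active source vertex, and stores each shard as a zlib-compressed blob in memory as a stand-in for both the on-disk shard and the compressed cache. The names `build_shards` and `min_label_propagation`, the plain Python set used as the per-shard source filter, and the choice of minimum-label propagation as the example algorithm are all illustrative assumptions.

```python
import pickle
import zlib


def build_shards(edges, num_vertices, num_shards):
    """Partition edges by destination interval (assumed layout, for illustration)."""
    size = -(-num_vertices // num_shards)  # ceiling division: vertices per interval
    buckets = [[] for _ in range(num_shards)]
    for src, dst in edges:
        buckets[dst // size].append((src, dst))
    shards = []
    for bucket in buckets:
        shards.append({
            # Set of source vertices: a stand-in for a compact per-shard filter.
            "sources": {src for src, _ in bucket},
            # Compressed bytes: a stand-in for a shard on disk / in a compressed cache.
            "blob": zlib.compress(pickle.dumps(bucket)),
        })
    return shards


def min_label_propagation(edges, num_vertices, num_shards=2):
    """Give each vertex the smallest vertex id that can reach it along edges.

    Returns (labels, loaded), where loaded counts shard decompressions.
    """
    shards = build_shards(edges, num_vertices, num_shards)
    labels = list(range(num_vertices))  # vertex values never leave memory
    active = set(range(num_vertices))   # vertices updated in the previous iteration
    loaded = 0
    while active:
        new_labels = labels[:]
        next_active = set()
        for shard in shards:  # slide the window over one shard at a time
            if shard["sources"].isdisjoint(active):
                continue  # selective scheduling: no active source, so skip the shard
            loaded += 1
            for src, dst in pickle.loads(zlib.decompress(shard["blob"])):
                if labels[src] < new_labels[dst]:
                    new_labels[dst] = labels[src]
                    next_active.add(dst)
        labels, active = new_labels, next_active
    return labels, loaded
```

For an undirected graph, pass each edge in both directions; the labels then identify connected components. On a five-vertex graph with components {0, 1, 2} and {3, 4} split into two shards, this sketch converges in three iterations and skips one shard load in the last iteration.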