Designing flexible graph kernels that run well on a variety of platforms is a crucial research problem, given how frequently graphs are used to model data and how diverse recent architectures have become. In this work, we propose a novel graph processing framework, PGAbB (Parallel Graph Algorithms by Blocks), for modern shared-memory heterogeneous platforms. Our framework implements a block-based programming model, which allows a user to express a graph algorithm using kernels that operate on subgraphs. PGAbB supports graph computations that fit in host DRAM but not in GPU device memory, and it provides simple but effective scheduling techniques that distribute computations across all available resources in a heterogeneous architecture. We demonstrate that a diverse set of graph algorithms can be implemented easily in our framework by developing five algorithms. Our experimental results show that PGAbB implementations achieve better or competitive performance compared to hand-optimized implementations. Based on our experiments on five graph algorithms and forty-four graphs, PGAbB achieves, in the median, 1.6, 1.6, 5.7, 3.4, 4.5, and 2.4 times better performance than the GAPBS, Galois, Ligra, LAGraph, Galois-GPU, and Gunrock graph processing systems, respectively.
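To make the block-based idea concrete, the following is a minimal sketch of what "kernels that operate on subgraphs" could look like. It is not PGAbB's API: the function names, the 2D edge partitioning, and the density-based split between a "gpu" and a "cpu" queue are all assumptions made for illustration, and the blocks are executed sequentially rather than dispatched to real devices.

```python
from collections import defaultdict


def partition_into_blocks(edges, num_vertices, parts):
    """Group edges into a parts x parts grid keyed by (source range, destination range)."""
    size = -(-num_vertices // parts)  # ceiling division: vertices per range
    blocks = defaultdict(list)
    for u, v in edges:
        blocks[(u // size, v // size)].append((u, v))
    return dict(blocks)


def in_degree_kernel(block_edges, in_degree):
    """Kernel that sees only one block (a subgraph) and updates the shared result."""
    for _, v in block_edges:
        in_degree[v] += 1


def split_by_weight(blocks, heavy_fraction=0.5):
    """Hypothetical scheduler: the densest blocks go to the 'gpu' queue until
    they cover heavy_fraction of all edges; the remaining blocks go to 'cpu'."""
    total = sum(len(block) for block in blocks.values())
    ordered = sorted(blocks, key=lambda key: len(blocks[key]), reverse=True)
    gpu, cpu, covered = [], [], 0
    for key in ordered:
        if covered < heavy_fraction * total:
            gpu.append(key)
            covered += len(blocks[key])
        else:
            cpu.append(key)
    return gpu, cpu


def run(edges, num_vertices, parts=2):
    """Partition the graph, assign blocks to queues, and apply the kernel per block."""
    blocks = partition_into_blocks(edges, num_vertices, parts)
    gpu, cpu = split_by_weight(blocks)
    in_degree = [0] * num_vertices
    # Executed sequentially here; a real runtime would dispatch the queues in parallel.
    for key in gpu + cpu:
        in_degree_kernel(blocks[key], in_degree)
    return in_degree, gpu, cpu


if __name__ == "__main__":
    sample_edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 0), (3, 2)]
    print(run(sample_edges, num_vertices=4))
```

Because each kernel call touches only one block, a block can in principle be moved to device memory on its own, which is one way a graph that fits in host DRAM but not in GPU memory could still be processed piece by piece.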