Graph Neural Networks (GNNs) have shown success in many real-world applications that involve graph-structured data. Most existing single-node GNN training systems can train on medium-scale graphs with tens of millions of edges; however, scaling them to large-scale graphs with billions of edges remains challenging. In addition, mapping GNN training algorithms onto a single compute node is difficult because state-of-the-art machines feature heterogeneous architectures consisting of multiple processors and a variety of accelerators. We propose HyScale-GNN, a novel system for training GNN models on a single-node heterogeneous architecture. HyScale-GNN performs hybrid training, which uses both the processors and the accelerators to train a model collaboratively. Our system design overcomes the memory size limitation of existing works and is optimized for training GNNs on large-scale graphs. We propose a two-stage data pre-fetching scheme to reduce the communication overhead during GNN training. To improve task mapping efficiency, we propose a dynamic resource management mechanism that adjusts the workload assignment and resource allocation at runtime. We evaluate HyScale-GNN on a CPU-GPU and a CPU-FPGA heterogeneous architecture. Using several large-scale datasets and two widely used GNN models, we compare the performance of our design with a multi-GPU baseline implemented in PyTorch-Geometric. The CPU-GPU design and the CPU-FPGA design achieve up to 2.08x and 12.6x speedup over this baseline, respectively. Compared with state-of-the-art large-scale multi-node GNN training systems such as P3 and DistDGL, our CPU-FPGA design achieves up to 5.27x speedup using a single node.