GraphTheta: A Distributed Graph Neural Network Learning System With Flexible Training Strategy

Graph neural networks (GNNs) have proven to be a powerful tool for analyzing non-Euclidean graph data. However, the lack of efficient distributed graph learning systems severely hinders applications of GNNs, especially when graphs are large and GNNs are relatively deep. Herein, we present GraphTheta, the first distributed and scalable graph learning system built upon vertex-centric distributed graph processing, with neural network operators implemented as user-defined functions. The system supports multiple training strategies and enables efficient and scalable big-graph learning on distributed (virtual) machines with low memory. To facilitate graph convolutions, GraphTheta puts forward a new graph learning abstraction named NN-TGAR to bridge the gap between graph processing and graph deep learning. A distributed graph engine is proposed to conduct stochastic gradient descent optimization with hybrid-parallel execution, and a new cluster-batched training strategy is supported. We evaluate GraphTheta on several datasets whose network sizes range from small through moderate to large scale. Experimental results show that GraphTheta scales well to 1,024 workers when training an in-house developed GNN on an industry-scale Alipay dataset of 1.4 billion nodes and 4.1 billion attributed edges, using a cluster of CPU virtual machines (Docker containers), each with little memory (5$\sim$12 GB). Moreover, GraphTheta can outperform DistDGL by up to $2.02\times$, with better scalability, and GraphLearn by up to $30.56\times$. As for model accuracy, GraphTheta learns GNNs that are as accurate as those trained with existing frameworks. To the best of our knowledge, this work presents the largest edge-attributed GNN learning task in the literature.
