Despite the recent success of Graph Neural Networks (GNNs), it remains challenging to train a GNN on large graphs, which are prevalent in applications such as social networks, recommender systems, and knowledge graphs. Traditional sampling-based methods accelerate GNN training by dropping edges and nodes, which impairs graph integrity and model performance. In contrast, distributed GNN algorithms, which accelerate GNN training by utilizing multiple computing devices, can be classified into two types: "partition-based" methods enjoy low communication costs but suffer from information loss due to dropped edges, while "propagation-based" methods avoid information loss but suffer from prohibitive communication overhead. To jointly address these problems, this paper proposes DIstributed Graph Embedding SynchronizaTion (DIGEST), a novel distributed GNN training framework that synergizes the complementary strengths of both categories of existing methods. During subgraph-parallel training, we propose to let each device store the historical embeddings of its neighbors in other subgraphs. Therefore, our method neither discards any neighbors in other subgraphs nor updates them intensively. This effectively avoids (1) the intensive computation on an explosively increasing number of neighbors and (2) excessive communication across different devices. We prove that the approximation error induced by the staleness of historical embeddings can be upper bounded and does not affect the GNN model's expressiveness. More importantly, our convergence analysis demonstrates that DIGEST enjoys a state-of-the-art convergence rate. Extensive experimental evaluation on large, real-world graph datasets shows that DIGEST achieves up to $21.82\times$ speedup without compromising performance, compared with state-of-the-art distributed GNN training frameworks.
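To illustrate the core idea of reading stale cross-partition embeddings from a local cache instead of communicating them at every training step, the following is a minimal sketch assuming a PyTorch-style setup; the names `HistoricalEmbeddingCache`, `read`, `synchronize`, and `aggregate` are hypothetical and do not describe DIGEST's actual implementation.

# Minimal sketch (assumption: PyTorch-style API; all names are hypothetical,
# not the authors' implementation) of caching historical embeddings for
# neighbors owned by other devices during subgraph-parallel training.
import torch

class HistoricalEmbeddingCache:
    """Keeps the most recently received embedding of each remote neighbor."""
    def __init__(self, num_remote_nodes, dim):
        # Stale copies of cross-partition neighbor embeddings, refreshed only
        # at periodic synchronization points rather than every iteration.
        self.cache = torch.zeros(num_remote_nodes, dim)

    def read(self, remote_ids):
        # Local lookup: no communication is needed at this step.
        return self.cache[remote_ids]

    def synchronize(self, remote_ids, fresh_embeddings):
        # Called occasionally (e.g., every K steps) with embeddings
        # communicated from the devices that own these nodes.
        self.cache[remote_ids] = fresh_embeddings.detach()

def aggregate(layer, x_local, local_neighbor_ids, remote_neighbor_ids, cache):
    # Mean-aggregate over in-partition neighbors (fresh embeddings) and
    # cross-partition neighbors (stale embeddings from the cache), so no
    # neighbor is dropped and no per-step communication is required.
    h_local = x_local[local_neighbor_ids]        # fresh, on-device
    h_remote = cache.read(remote_neighbor_ids)   # historical, cached
    h = torch.cat([h_local, h_remote], dim=0).mean(dim=0, keepdim=True)
    return layer(h)

In such a scheme, the synchronization interval controls the trade-off that the abstract alludes to: less frequent synchronization lowers communication at the cost of staler cached embeddings, whose induced error DIGEST shows to be upper bounded.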