Graph Neural Networks (GNNs) are a family of neural models tailored for graph-structured data and have shown superior performance in learning representations for such data. However, training GNNs on large graphs remains challenging, and a promising direction is distributed GNN training, which partitions the input graph and distributes the workload across multiple machines. The key bottleneck of existing distributed GNN training frameworks is the cross-machine communication induced by the data dependencies of the graph and the aggregation operator of GNNs. In this paper, we study the communication complexity of distributed GNN training and propose a simple lossless communication-reduction method, termed the Aggregation before Communication (ABC) method. The ABC method exploits the permutation-invariant property of the GNN layer and leads to a partition paradigm in which vertex-cut is proved to admit superior communication performance over the currently popular paradigm (edge-cut). In addition, we show that the new partition paradigm is particularly suitable for dynamic graphs, where it is infeasible to control the edge placement because the stochastics of the graph-changing process are unknown.
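The following is a minimal sketch, not the paper's implementation, of the intuition behind aggregation before communication: with a permutation-invariant and associative aggregator such as sum, a machine can pre-aggregate the features of its locally held neighbors and ship a single partial aggregate instead of every raw neighbor feature. All variable names (e.g. `local_features`, `partial_sum_on_A`) are hypothetical.

```python
# Minimal sketch of the Aggregation-before-Communication idea with a sum
# aggregator (assumption: sum aggregation, two machines, one remote vertex).
import numpy as np

np.random.seed(0)
F = 8  # feature dimension

# Machine A holds the features of vertices {0, 1, 2, 3}, all of which are
# in-neighbors of a vertex v that resides on machine B.
local_features = {u: np.random.randn(F) for u in range(4)}

# Baseline (edge-cut style): ship every neighbor feature to machine B,
# which aggregates them there. Communication volume = 4 vectors of size F.
sent_baseline = [local_features[u] for u in range(4)]
agg_on_B = np.sum(sent_baseline, axis=0)

# ABC (vertex-cut friendly): since sum is permutation-invariant and
# associative, machine A aggregates its local partial neighborhood first
# and sends one vector. Communication volume = 1 vector of size F.
partial_sum_on_A = np.sum([local_features[u] for u in range(4)], axis=0)
agg_via_abc = partial_sum_on_A  # machine B would add partials from other machines

assert np.allclose(agg_on_B, agg_via_abc)  # lossless: identical aggregate
print("baseline volume:", len(sent_baseline) * F, "floats")
print("ABC volume     :", F, "floats")
```

The reduction is lossless because the final aggregate is identical; only the order in which the commutative aggregation is evaluated changes.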