Graph Neural Networks (GNNs) have been studied through the lens of expressive power and generalization. However, their optimization properties are less well understood. We take the first step towards analyzing GNN training by studying the gradient dynamics of GNNs. First, we analyze linearized GNNs and prove that, despite the non-convexity of training, convergence to a global minimum at a linear rate is guaranteed under mild assumptions that we validate on real-world graphs. Second, we study what may affect the training speed of GNNs. Our results show that the training of GNNs is implicitly accelerated by skip connections, more depth, and/or a good label distribution. Empirical results confirm that our theoretical results for linearized GNNs align with the training behavior of nonlinear GNNs. Our results provide the first theoretical support for the success of GNNs with skip connections in terms of optimization, and suggest that deep GNNs with skip connections would be promising in practice.
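To make the optimization claims concrete, below is a minimal sketch (not the authors' code) of gradient descent on a one-layer linearized GNN, with and without a skip connection, on a random graph. The graph size, sparsity, learning-rate rule, and all names are illustrative assumptions; the paper's analysis covers deeper linearized GNNs, whereas this single-layer version only shows how a skip connection changes the effective feature matrix, its conditioning, and hence the linear convergence rate of gradient descent.

```python
# Illustrative sketch only: gradient descent on a linearized (no nonlinearity)
# one-layer GNN, comparing training with and without a skip connection.
# All sizes and hyperparameters are assumptions made for this demo.
import numpy as np

rng = np.random.default_rng(0)
n, d, c = 200, 16, 4                          # nodes, input features, outputs

# Random undirected graph with self-loops, symmetrically normalized (GCN-style).
A = (rng.random((n, n)) < 0.05).astype(float)
A = np.maximum(A, A.T)
np.fill_diagonal(A, 1.0)
d_inv_sqrt = 1.0 / np.sqrt(A.sum(1))
A_hat = d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]

X = rng.standard_normal((n, d))
Y = rng.standard_normal((n, c))               # arbitrary regression targets

def train(use_skip, steps=500):
    # Linearized one-layer GNN: predictions are P @ W, where a skip connection
    # simply adds the raw features X to the aggregated features A_hat @ X.
    P = A_hat @ X + (X if use_skip else 0.0)
    eigs = np.linalg.eigvalsh(P.T @ P)
    lr = 1.0 / eigs.max()                      # safe step size (below 2 / L)
    kappa = eigs.max() / eigs.min()            # condition number governs the rate
    # Global-minimum loss of this least-squares problem, for reference.
    W_opt, *_ = np.linalg.lstsq(P, Y, rcond=None)
    loss_opt = 0.5 * np.sum((P @ W_opt - Y) ** 2)
    W = np.zeros((d, c))
    gaps = []
    for _ in range(steps):
        R = P @ W - Y
        gaps.append(0.5 * np.sum(R ** 2) - loss_opt)   # suboptimality gap
        W -= lr * P.T @ R                              # exact gradient step
    return kappa, gaps

for use_skip in (False, True):
    kappa, gaps = train(use_skip)
    print(f"skip={use_skip!s:5}  cond={kappa:9.1f}  "
          f"relative gap after 500 steps: {gaps[-1] / gaps[0]:.2e}")
```

Under these assumptions, the skip connection replaces the propagation matrix `A_hat @ X` with `(A_hat + I) @ X`, which typically improves the conditioning of the least-squares problem and thus tightens the geometric contraction factor of gradient descent, consistent with the abstract's claim of implicit acceleration.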