Training Graph Neural Networks with 1000 Layers

From arXiv. Accepted at ICML 2021. Code available at https://www.deepgcns.org/arch/gnn1000. Work done during Guohao Li's internship at Intel Intelligent Systems Lab.

Deep graph neural networks (GNNs) have achieved excellent results on various tasks on increasingly large graph datasets with millions of nodes and edges. However, memory complexity has become a major obstacle when training deep GNNs for practical applications due to the immense number of nodes, edges, and intermediate activations. To improve the scalability of GNNs, prior works propose smart graph sampling or partitioning strategies to train GNNs with a smaller set of nodes or sub-graphs. In this work, we study reversible connections, group convolutions, weight tying, and equilibrium models to advance the memory and parameter efficiency of GNNs. We find that reversible connections in combination with deep network architectures enable the training of overparameterized GNNs that significantly outperform existing methods on multiple datasets. Our models RevGNN-Deep (1001 layers with 80 channels each) and RevGNN-Wide (448 layers with 224 channels each) were both trained on a single commodity GPU and achieve an ROC-AUC of $87.74 \pm 0.13$ and $88.24 \pm 0.15$ on the ogbn-proteins dataset. To the best of our knowledge, RevGNN-Deep is the deepest GNN in the literature by one order of magnitude. Please visit our project website https://www.deepgcns.org/arch/gnn1000 for more information.
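To make the idea of a reversible connection concrete, here is a minimal sketch of a generic two-group additive coupling (in the style of reversible residual networks). It is not the authors' implementation: the functions `f` and `g` are hypothetical stand-ins for GNN blocks, and the plain Python lists stand in for node feature tensors. The point it illustrates is that a layer's inputs can be recomputed exactly from its outputs, so intermediate activations need not be kept in memory during training.

```python
import math


def f(v):
    # Hypothetical stand-in for a learned GNN block acting on one feature group.
    return [math.tanh(0.5 * a) for a in v]


def g(v):
    # Second hypothetical block; any deterministic function would work here.
    return [math.sin(a) for a in v]


def rev_forward(x1, x2):
    # Additive coupling: each output group is an input group plus a residual
    # computed from the other group.
    y1 = [a + b for a, b in zip(x1, f(x2))]
    y2 = [a + b for a, b in zip(x2, g(y1))]
    return y1, y2


def rev_inverse(y1, y2):
    # Undo the coupling in reverse order, recomputing the same residuals.
    x2 = [a - b for a, b in zip(y2, g(y1))]
    x1 = [a - b for a, b in zip(y1, f(x2))]
    return x1, x2


def run_stack(x1, x2, num_layers):
    # Forward pass through a stack of layers, keeping only the latest activations.
    for _ in range(num_layers):
        x1, x2 = rev_forward(x1, x2)
    return x1, x2


def invert_stack(y1, y2, num_layers):
    # Recover the stack's inputs from its final outputs, layer by layer.
    for _ in range(num_layers):
        y1, y2 = rev_inverse(y1, y2)
    return y1, y2


if __name__ == "__main__":
    x1, x2 = [0.1, -0.2, 0.3], [0.4, 0.5, -0.6]
    y1, y2 = run_stack(x1, x2, num_layers=8)
    r1, r2 = invert_stack(y1, y2, num_layers=8)
    err = max(abs(a - b) for a, b in zip(r1 + r2, x1 + x2))
    print("max reconstruction error:", err)
```

Because every layer can be inverted this way, a backward pass can rebuild each layer's inputs on the fly from the layer above. The activation memory of the stack then no longer has to grow with depth, at the cost of extra computation. In floating point the reconstruction is only approximate, so the sketch compares against a tolerance rather than testing for equality.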
