Since the proposal of the graph neural network (GNN) by Gori et al. (2005) and Scarselli et al. (2008), one of the major problems in training GNNs has been their struggle to propagate information between distant nodes in the graph. We propose a new explanation for this problem: GNNs are susceptible to a bottleneck when aggregating messages across a long path. This bottleneck causes the over-squashing of exponentially growing information into fixed-size vectors. As a result, GNNs fail to propagate messages originating from distant nodes and perform poorly when the prediction task depends on long-range interaction. In this paper, we highlight the inherent problem of over-squashing in GNNs. We demonstrate that the bottleneck hinders popular GNNs from fitting long-range signals in the training data. We further show that GNNs that absorb incoming edges equally, such as GCN and GIN, are more susceptible to over-squashing than GAT and GGNN. Finally, we show that prior work, which extensively tuned GNN models on long-range problems, suffers from over-squashing, and that breaking the bottleneck improves their state-of-the-art results without any tuning or additional weights. Our code is available at https://github.com/tech-srl/bottleneck/ .
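To make "breaking the bottleneck" concrete, the sketch below shows one way the idea can be realized: the final message-passing layer operates over a fully-adjacent graph, so every pair of nodes exchanges information directly instead of squashing it through long paths. This is a minimal illustration, not the authors' released code; the use of PyTorch Geometric's GCNConv, the layer sizes, and the helper fully_adjacent_edge_index are assumptions made for the example.

```python
import torch
import torch.nn.functional as F
from torch_geometric.nn import GCNConv


def fully_adjacent_edge_index(num_nodes: int) -> torch.Tensor:
    # All ordered pairs (i, j) with i != j, as a 2 x E edge index (hypothetical helper).
    row = torch.arange(num_nodes).repeat_interleave(num_nodes)
    col = torch.arange(num_nodes).repeat(num_nodes)
    mask = row != col
    return torch.stack([row[mask], col[mask]], dim=0)


class GCNWithFullyAdjacentLastLayer(torch.nn.Module):
    # Illustrative sketch: same layer type and number of weights as a plain GCN,
    # but the last layer sees a fully-adjacent graph.
    def __init__(self, in_dim: int, hidden_dim: int, out_dim: int, num_layers: int = 4):
        super().__init__()
        dims = [in_dim] + [hidden_dim] * (num_layers - 1) + [out_dim]
        self.convs = torch.nn.ModuleList(
            GCNConv(dims[i], dims[i + 1]) for i in range(num_layers)
        )

    def forward(self, x: torch.Tensor, edge_index: torch.Tensor) -> torch.Tensor:
        # All layers but the last use the original sparse graph structure.
        for conv in self.convs[:-1]:
            x = F.relu(conv(x, edge_index))
        # Last layer: identical weights, but every node pair is connected, so
        # distant nodes can interact without being squashed along long paths.
        fa_edge_index = fully_adjacent_edge_index(x.size(0)).to(x.device)
        return self.convs[-1](x, fa_edge_index)
```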