Message Passing Neural Networks (MPNNs) are widely used for learning on graphs, but their ability to process long-range information is limited by the phenomenon of oversquashing. This limitation has led some researchers to advocate Graph Transformers as a better alternative, whereas others suggest that oversquashing can be mitigated within the MPNN framework using virtual nodes or other rewiring techniques. In this work, we demonstrate that oversquashing is not limited to long-range tasks, but can also arise in short-range problems. This observation allows us to disentangle two distinct mechanisms underlying oversquashing: (1) the bottleneck phenomenon, which can arise even in short-range settings, and (2) the vanishing gradient phenomenon, which is closely associated with long-range tasks. We further show that the short-range bottleneck effect is not captured by existing explanations of oversquashing, and that adding virtual nodes does not resolve it. In contrast, Graph Transformers do succeed on such tasks, positioning them as the more compelling solution to oversquashing compared to specialized MPNNs.
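For readers unfamiliar with the two ingredients discussed above, the following is a minimal illustrative sketch, not the paper's construction: a single mean-aggregation message-passing layer, plus the virtual-node rewiring mentioned in the abstract, which adds one extra node connected to every original node so that information can travel through it in two hops. All function names and the toy graph are assumptions introduced here for illustration.

```python
import numpy as np

def mpnn_layer(adj: np.ndarray, feats: np.ndarray, weight: np.ndarray) -> np.ndarray:
    """One message-passing step: average neighbour features, then a linear map and ReLU."""
    deg = adj.sum(axis=1, keepdims=True).clip(min=1)   # node degrees, guarded against zero
    messages = adj @ feats / deg                        # mean over each node's neighbours
    return np.maximum(0.0, messages @ weight)           # simple nonlinear update

def add_virtual_node(adj: np.ndarray, feats: np.ndarray):
    """Append a virtual node connected to all original nodes, with zero-initialised features."""
    n = adj.shape[0]
    new_adj = np.zeros((n + 1, n + 1))
    new_adj[:n, :n] = adj
    new_adj[:n, n] = new_adj[n, :n] = 1.0               # edges to/from the virtual node
    new_feats = np.vstack([feats, np.zeros((1, feats.shape[1]))])
    return new_adj, new_feats

# Toy usage: a 4-node path graph with 3-dimensional features (hypothetical example).
rng = np.random.default_rng(0)
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
feats = rng.normal(size=(4, 3))
weight = rng.normal(size=(3, 3))

adj_vn, feats_vn = add_virtual_node(adj, feats)
out = mpnn_layer(adj_vn, feats_vn, weight)
print(out.shape)  # (5, 3): the four original nodes plus the virtual node
```

The virtual node shortens all pairwise distances to at most two hops, which is why it is commonly proposed against long-range oversquashing; the abstract's claim is that this does not remove the short-range bottleneck effect.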