Many Graph Neural Networks (GNNs) perform poorly compared to simple heuristics on Link Prediction (LP) tasks. This is due to limitations in expressive power, such as the inability to count triangles (the backbone of most LP heuristics) and the inability to distinguish automorphic nodes (those with identical structural roles). Both expressiveness issues can be alleviated by learning link (rather than node) representations and incorporating structural features such as triangle counts. Since explicit link representations are often prohibitively expensive, recent works have resorted to subgraph-based methods, which achieve state-of-the-art performance for LP but suffer from poor efficiency due to the high redundancy between subgraphs. We analyze the components of subgraph GNN (SGNN) methods for link prediction. Based on our analysis, we propose a novel full-graph GNN called ELPH (Efficient Link Prediction with Hashing) that passes subgraph sketches as messages to approximate the key components of SGNNs without explicit subgraph construction. ELPH is provably more expressive than Message Passing GNNs (MPNNs). It outperforms existing SGNN models on many standard LP benchmarks while being orders of magnitude faster. However, it shares the common GNN limitation that it is efficient only when the dataset fits in GPU memory. Accordingly, we develop a highly scalable model, called BUDDY, that uses feature precomputation to circumvent this limitation without sacrificing predictive performance. Our experiments show that BUDDY also outperforms SGNNs on standard LP benchmarks while being highly scalable and faster than ELPH.
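To make the central idea concrete, the following is a minimal, self-contained sketch (not the authors' implementation) of how subgraph sketches can stand in for explicit subgraphs: each node carries a MinHash signature of its neighbourhood, min-aggregation over edges propagates those signatures (so after k rounds a signature summarises the k-hop neighbourhood), and the overlap of two candidate endpoints' neighbourhoods is then estimated from signatures alone. The function and variable names (minhash_signatures, propagate, estimate_jaccard, NUM_PERMS) are illustrative assumptions, not the paper's API, and the sketch covers only the MinHash component, not the full ELPH/BUDDY feature set.

```python
# Illustrative sketch of neighbourhood sketching for link prediction.
# Assumption: MinHash over node IDs; unions of neighbourhoods correspond to
# elementwise minima of signatures, so sketches can be message-passed.
import numpy as np

NUM_PERMS = 64  # hash functions per signature; more -> lower estimator variance


def minhash_signatures(num_nodes: int, rng: np.random.Generator) -> np.ndarray:
    """One random hash value per (node, permutation); row v is the MinHash
    signature of the 0-hop 'neighbourhood' {v}."""
    return rng.integers(0, 2**32, size=(num_nodes, NUM_PERMS), dtype=np.uint64)


def propagate(sig: np.ndarray, edges: list[tuple[int, int]]) -> np.ndarray:
    """One round of min-aggregation over edges. Since MinHash of a set union is
    the elementwise min of the sets' signatures, k rounds yield signatures of
    the (closed) k-hop neighbourhoods without building any subgraph."""
    out = sig.copy()
    for u, v in edges:
        out[u] = np.minimum(out[u], sig[v])
        out[v] = np.minimum(out[v], sig[u])
    return out


def estimate_jaccard(sig_u: np.ndarray, sig_v: np.ndarray) -> float:
    """The fraction of matching MinHash slots is an unbiased estimate of the
    Jaccard similarity of the underlying neighbourhoods."""
    return float(np.mean(sig_u == sig_v))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy graph: two triangles sharing node 2.
    edges = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (2, 4)]
    sig = propagate(minhash_signatures(5, rng), edges)  # 1-hop sketches
    # Neighbourhood overlap for candidate links, usable as a structural feature:
    print("J(0,1) ~", estimate_jaccard(sig[0], sig[1]))  # same triangle: high
    print("J(0,4) ~", estimate_jaccard(sig[0], sig[4]))  # distant pair: low
```

Because the signatures are fixed-size arrays that combine with an elementwise min, they can be precomputed once and looked up at inference time, which is the property BUDDY exploits to scale beyond GPU memory.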