The self-attention mechanism in graph neural networks (GNNs) has led to state-of-the-art performance on many graph representation learning tasks. Currently, at every layer, attention is computed between connected pairs of nodes and depends solely on the representations of the two nodes. However, such an attention mechanism does not account for nodes that are not directly connected but provide important network context. Here we propose the Multi-hop Attention Graph Neural Network (MAGNA), a principled way to incorporate multi-hop context information into every layer of attention computation. MAGNA diffuses the attention scores across the network, which increases the receptive field of every layer of the GNN. Unlike previous approaches, MAGNA uses a diffusion prior on attention values to efficiently account for all paths between a pair of disconnected nodes. We demonstrate in theory and experiments that MAGNA captures large-scale structural information in every layer and has a low-pass effect that eliminates noisy high-frequency information from graph data. Experimental results on node classification as well as knowledge graph completion benchmarks show that MAGNA achieves state-of-the-art results: MAGNA achieves up to a 5.7 percent relative error reduction over the previous state of the art on Cora, Citeseer, and Pubmed. MAGNA also obtains the best performance on a large-scale Open Graph Benchmark dataset. On knowledge graph completion, MAGNA advances the state of the art on WN18RR and FB15k-237 across four different performance metrics.
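To make the attention-diffusion idea concrete, the following is a minimal NumPy sketch, not the authors' implementation. The GAT-style one-hop scoring in `one_hop_attention` is only a stand-in for whatever per-layer attention the model computes, and `attention_diffusion` approximates a geometric-series diffusion of the attention matrix (weights alpha*(1-alpha)^k over k-hop paths) by fixed-point iteration, which is how one layer can aggregate multi-hop context. The function names and the parameters `alpha` and `hops` are illustrative assumptions.

```python
# Minimal sketch of attention diffusion over a graph (illustrative, not the paper's code).
import numpy as np

def one_hop_attention(h, adj, w_src, w_dst):
    """Row-normalized attention over existing edges (GAT-style stand-in)."""
    scores = h @ w_src[:, None] + (h @ w_dst[:, None]).T   # score[i, j] = w_src.h_i + w_dst.h_j
    scores = np.where(adj > 0, scores, -np.inf)             # attend only along edges
    scores = scores - scores.max(axis=1, keepdims=True)     # numerical stability
    att = np.exp(scores)                                    # exp(-inf) = 0 for non-edges
    return att / att.sum(axis=1, keepdims=True)

def attention_diffusion(att, h, alpha=0.1, hops=6):
    """Approximate Z ~= sum_k alpha*(1-alpha)^k * att^k @ h by fixed-point iteration,
    so the layer aggregates multi-hop context instead of only direct neighbors."""
    z = h.copy()
    for _ in range(hops):
        z = (1.0 - alpha) * (att @ z) + alpha * h
    return z

# Toy usage: 4 nodes on a path graph, 3-dimensional features.
rng = np.random.default_rng(0)
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
h = rng.normal(size=(4, 3))
att = one_hop_attention(h, adj, rng.normal(size=3), rng.normal(size=3))
print(attention_diffusion(att, h).shape)  # (4, 3): multi-hop-aware node features
```

With `alpha` close to 1 the diffusion collapses to ordinary one-hop attention, while smaller values spread attention mass over longer paths, which is the sense in which diffusion enlarges each layer's receptive field.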