Recently popularized graph neural networks achieve state-of-the-art accuracy on a number of standard benchmark datasets for graph-based semi-supervised learning, improving significantly over existing approaches. These architectures alternate between a propagation layer that aggregates the hidden states of the local neighborhood and a fully-connected layer. Perhaps surprisingly, we show that a linear model that removes all the intermediate fully-connected layers is still able to achieve performance comparable to the state-of-the-art models. This significantly reduces the number of parameters, which is critical for semi-supervised learning, where the number of labeled examples is small. This in turn leaves room for designing more innovative propagation layers. Based on this insight, we propose a novel graph neural network that removes all the intermediate fully-connected layers and replaces the propagation layers with attention mechanisms that respect the structure of the graph. The attention mechanism allows us to learn a dynamic and adaptive local summary of the neighborhood, yielding more accurate predictions. In a number of experiments on benchmark citation network datasets, we demonstrate that our approach outperforms competing methods. By examining the attention weights among neighbors, we also show that our model provides some interesting insights on how neighbors influence each other.
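To make the described architecture concrete, the following is a minimal sketch of what an attention-based propagation layer of this kind might look like. It is an illustrative implementation under stated assumptions, not the paper's reference code: we assume attention scores are derived from pairwise similarity of node hidden states, masked to the graph's edges so the mechanism respects the graph structure, and normalized with a softmax over each neighborhood. The class name `AttentionPropagation` and the single temperature parameter `beta` are hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentionPropagation(nn.Module):
    """One propagation layer whose neighbor weights are learned attention
    scores rather than fixed (e.g. degree-normalized) constants.

    Hypothetical sketch: assumes cosine-similarity attention with a single
    learned temperature, softmax-normalized over each node's neighborhood."""

    def __init__(self):
        super().__init__()
        # A single scalar temperature is the only parameter, keeping the
        # layer nearly linear, in the parameter-light spirit of the abstract.
        self.beta = nn.Parameter(torch.tensor(1.0))

    def forward(self, h, adj):
        # h:   (n, d) node hidden states, one row per node
        # adj: (n, n) binary adjacency matrix, assumed to include self-loops
        h_norm = F.normalize(h, p=2, dim=1)           # unit-norm rows
        cos = h_norm @ h_norm.t()                     # pairwise cosine similarity
        scores = self.beta * cos
        # Mask out non-edges so attention respects the graph structure.
        scores = scores.masked_fill(adj == 0, float('-inf'))
        attn = torch.softmax(scores, dim=1)           # weights over each neighborhood
        return attn @ h                               # attention-weighted neighbor average
```

Stacking a few such layers between a single input projection and a softmax output layer would match the design the abstract argues for: each propagation layer above adds only one scalar parameter, and the attention weights `attn` are the per-neighbor influences that can be inspected directly.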