Graph Neural Networks (GNNs) have proven to be an effective representation learning framework for graph-structured data, and have achieved state-of-the-art performance on many practical predictive tasks, such as node classification, link prediction, and graph classification. Among the variants of GNNs, Graph Attention Networks (GATs) learn to assign dense attention coefficients over all the neighbors of a node for feature aggregation, and improve the performance of many graph learning tasks. However, real-world graphs are often very large and noisy, and GATs are prone to overfitting if not regularized properly. Even worse, the local aggregation mechanism of GATs may fail on disassortative graphs, where nodes within a local neighborhood provide more noise than useful information for feature aggregation. In this paper, we propose Sparse Graph Attention Networks (SGATs) that learn sparse attention coefficients under an $L_0$-norm regularization; the learned sparse attention coefficients are then shared by all GNN layers, resulting in an edge-sparsified graph. By doing so, we can identify noisy/task-irrelevant edges and perform feature aggregation over only the most informative neighbors. Extensive experiments on synthetic and real-world graph learning benchmarks demonstrate the superior performance of SGATs. In particular, SGATs can remove about 50\%-80\% of the edges from large assortative graphs while retaining similar classification accuracies. On disassortative graphs, SGATs prune the majority of noisy edges and outperform GATs in classification accuracy by significant margins. Furthermore, the removed edges can be interpreted intuitively and quantitatively. To the best of our knowledge, this is the first graph learning algorithm that shows significant redundancy in graphs, and that edge-sparsified graphs can achieve similar or sometimes higher predictive performance than the original graphs.
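To make the $L_0$-regularized sparse attention concrete, the sketch below shows one standard way to obtain a differentiable $L_0$ penalty on edges: a stochastic binary gate per edge, relaxed with the hard-concrete distribution of Louizos et al. (2018). This is a minimal illustration, not the paper's exact parameterization; the class name `HardConcreteEdgeGate` and the hyperparameters `beta`, `gamma`, and `zeta` are assumptions. Gates that reach exactly 0 prune their edges from aggregation, and the expected number of nonzero gates serves as the $L_0$ surrogate.

```python
import math
import torch
import torch.nn as nn


class HardConcreteEdgeGate(nn.Module):
    """One stochastic binary gate per edge, relaxed with the hard-concrete
    distribution so the expected L0 norm of the gates is differentiable.
    Illustrative sketch only; hyperparameter values follow Louizos et al."""

    def __init__(self, num_edges, beta=2.0 / 3.0, gamma=-0.1, zeta=1.1):
        super().__init__()
        self.log_alpha = nn.Parameter(torch.zeros(num_edges))  # gate logits
        self.beta, self.gamma, self.zeta = beta, gamma, zeta

    def forward(self):
        if self.training:
            # reparameterized sample from the concrete distribution
            u = torch.rand_like(self.log_alpha).clamp(1e-6, 1 - 1e-6)
            s = torch.sigmoid(
                (u.log() - (1 - u).log() + self.log_alpha) / self.beta
            )
        else:
            s = torch.sigmoid(self.log_alpha)  # deterministic at test time
        # stretch to (gamma, zeta), then clip to [0, 1]; exact zeros prune edges
        s_bar = s * (self.zeta - self.gamma) + self.gamma
        return s_bar.clamp(0.0, 1.0)

    def expected_l0(self):
        # P(gate != 0), the differentiable surrogate for the L0 penalty
        return torch.sigmoid(
            self.log_alpha - self.beta * math.log(-self.gamma / self.zeta)
        ).sum()


if __name__ == "__main__":
    gate = HardConcreteEdgeGate(num_edges=10)
    z = gate()                    # one gate value in [0, 1] per edge
    penalty = gate.expected_l0()  # add lambda * penalty to the task loss
    print(z, penalty)
```

In such a setup, the per-edge gates `z` would multiply the attention coefficients before neighborhood aggregation in every layer, so the same learned sparsity pattern, i.e., the edge-sparsified graph, is shared across all GNN layers as described above.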