Graph Neural Networks (GNNs) are effective in many applications. Still, there is limited understanding of how common graph structures affect the learning process of GNNs. In this work, we systematically study the impact of community structure on the performance of GNNs in semi-supervised node classification on graphs. In an ablation study on six datasets, we measure the performance of GNNs on the original graphs and the change in performance in the presence and in the absence of community structure. Our results suggest that communities typically have a major impact on the learning process and classification performance. For example, in cases where the majority of nodes in a community share a single classification label, breaking up the community structure results in a significant performance drop. On the other hand, in cases where labels show low correlation with communities, we find that the graph structure is largely irrelevant to the learning process, and a feature-only baseline becomes hard to beat. With our work, we provide deeper insights into the abilities and limitations of GNNs, including a set of general guidelines for model selection based on the graph structure.
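To make the idea of "breaking up community structure" concrete, the following is a minimal sketch under stated assumptions, not the procedure used in this work (the abstract does not specify one). It assumes a synthetic two-community graph and uses degree-preserving double edge swaps as one possible way to destroy communities while keeping every node's degree. The helper names (`planted_partition`, `degree_preserving_rewire`, `intra_community_fraction`) and all parameter values are hypothetical and chosen only for illustration.

```python
import random


def planted_partition(n_per_block, p_in, p_out, rng):
    """Undirected graph with two equally sized planted communities."""
    n = 2 * n_per_block
    community = [i // n_per_block for i in range(n)]
    edges = set()
    for u in range(n):
        for v in range(u + 1, n):
            p = p_in if community[u] == community[v] else p_out
            if rng.random() < p:
                edges.add((u, v))  # stored with u < v
    return edges, community


def intra_community_fraction(edges, community):
    """Share of edges whose endpoints lie in the same community."""
    same = sum(1 for u, v in edges if community[u] == community[v])
    return same / len(edges)


def degrees(edges, n):
    """Degree of every node, as a list indexed by node id."""
    deg = [0] * n
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg


def degree_preserving_rewire(edges, n_swaps, rng):
    """Double edge swaps (a,b),(c,d) -> (a,d),(c,b); node degrees are unchanged."""
    edge_list = list(edges)
    edge_set = set(edges)
    done, attempts = 0, 0
    while done < n_swaps and attempts < 100 * n_swaps:
        attempts += 1
        i, j = rng.sample(range(len(edge_list)), 2)
        a, b = edge_list[i]
        c, d = edge_list[j]
        if rng.random() < 0.5:  # random orientation of the second edge
            c, d = d, c
        if len({a, b, c, d}) < 4:  # would create a self-loop
            continue
        e1 = (min(a, d), max(a, d))
        e2 = (min(c, b), max(c, b))
        if e1 in edge_set or e2 in edge_set:  # would create a duplicate edge
            continue
        edge_set.discard(edge_list[i])
        edge_set.discard(edge_list[j])
        edge_set.add(e1)
        edge_set.add(e2)
        edge_list[i], edge_list[j] = e1, e2
        done += 1
    return edge_set


rng = random.Random(0)
# Assumed toy parameters: dense within communities, sparse between them.
edges, community = planted_partition(30, 0.3, 0.02, rng)
rewired = degree_preserving_rewire(edges, 10 * len(edges), rng)

before = intra_community_fraction(edges, community)
after = intra_community_fraction(rewired, community)
print(f"intra-community edge fraction: before={before:.2f}, after={after:.2f}")
```

In an ablation of this kind, the same GNN would be trained on both the original and the rewired edge set and compared against a feature-only model; if labels follow communities, the rewired graph is expected to carry much less useful signal.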