Graph neural networks (GNNs) have shown superiority in many prediction tasks on graphs thanks to their impressive capability of capturing nonlinear relations in graph-structured data. However, for node classification tasks, GNNs often achieve only marginal improvement over their linear counterparts. Previous works provide little understanding of this phenomenon. In this work, we resort to Bayesian learning to investigate in depth the role of non-linearity in GNNs for node classification. Given a graph generated from the contextual stochastic block model (CSBM), we observe that the maximum-a-posteriori (MAP) estimation of a node label given its own and its neighbors' attributes consists of two types of non-linearity: a possibly non-linear transformation of the node attributes and a ReLU-activated feature aggregation from neighbors. The latter, surprisingly, matches the type of non-linearity used in many GNN models. By further imposing a Gaussian assumption on the node attributes, we prove that the advantage of these ReLU activations is significant only when the node attributes are far more informative than the graph structure, which nicely matches many previous empirical observations. A similar argument holds when there is a distribution shift of node attributes between the training and testing datasets. Finally, we verify our theory on both synthetic and real-world networks.
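For concreteness, a standard two-class CSBM with Gaussian node attributes (a common formulation in this line of work; the exact parameterization used in the paper may differ) generates, for nodes with labels $y_i \in \{+1, -1\}$, an adjacency matrix and attributes as

$$
A_{ij} \sim \mathrm{Bernoulli}(p) \ \text{if } y_i = y_j, \qquad
A_{ij} \sim \mathrm{Bernoulli}(q) \ \text{if } y_i \neq y_j, \qquad
X_i \sim \mathcal{N}(y_i \mu,\ \sigma^2 I).
$$

Here the gap $p - q$ controls how informative the graph structure is, while $\|\mu\|/\sigma$ controls how informative the attributes are; the abstract's claim is that the ReLU-style aggregation yields a significant advantage only when the latter dominates the former.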