Graph neural networks (GNNs) have gained traction over the past few years for their superior performance in numerous machine learning tasks. Graph Convolutional Networks (GCNs) are a common variant of GNNs known to perform well in semi-supervised node classification (SSNC) under the assumption of homophily. Recent literature has highlighted that GCNs can also achieve strong performance on heterophilous graphs under certain "special conditions". These observations motivate us to understand why, and how, GCNs learn to perform SSNC. We find a positive correlation between the similarity of the latent embeddings of nodes within a class and the performance of a GCN. Our investigation of the underlying graph structure of a dataset finds that a GCN's SSNC performance is significantly influenced by the consistency and uniqueness of the neighborhood structure of nodes within a class.
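The reported correlation involves the similarity of latent embeddings of nodes sharing a class. As an illustrative sketch only (the function and its name are our own, not the paper's metric), one common way to quantify this is the average pairwise cosine similarity of embeddings within each class:

```python
import numpy as np

def mean_intra_class_similarity(embeddings, labels):
    """Average pairwise cosine similarity of embeddings within each class."""
    # Normalize rows so dot products become cosine similarities.
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    unit = embeddings / np.clip(norms, 1e-12, None)
    per_class = []
    for c in np.unique(labels):
        members = unit[labels == c]
        n = len(members)
        if n < 2:
            continue
        gram = members @ members.T  # pairwise cosine similarities
        # Average over off-diagonal entries (exclude self-similarity = 1).
        per_class.append((gram.sum() - n) / (n * (n - 1)))
    return float(np.mean(per_class))

# Toy example: two tight clusters of embeddings, one per class,
# yield high intra-class similarity.
emb = np.array([[1.0, 0.1], [0.9, 0.2], [-0.1, 1.0], [0.0, 0.9]])
lab = np.array([0, 0, 1, 1])
score = mean_intra_class_similarity(emb, lab)
```

Under the finding above, a GCN producing embeddings with a higher such score would be expected to achieve better SSNC accuracy.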