Many recent works have studied the performance of Graph Neural Networks (GNNs) in the context of graph homophily, a label-dependent measure of connectivity. Traditional GNNs generate node embeddings by aggregating information from a node's neighbors in the graph. Recent results on node classification tasks show that this local aggregation approach performs poorly on graphs with low homophily (heterophilic graphs). Several mechanisms have been proposed to improve the accuracy of GNNs on such graphs by increasing the aggregation range of a GNN layer, either through multi-hop aggregation or through long-range aggregation from distant nodes. In this paper, we show that properly tuned classical GNNs and multi-layer perceptrons match or exceed the accuracy of recent long-range aggregation methods on heterophilic graphs. Our results therefore highlight the need for alternative datasets to benchmark long-range GNN aggregation mechanisms. We also show that homophily is a poor measure of the information in a node's local neighborhood, and we propose the Neighborhood Information Content (NIC) metric, a novel information-theoretic graph metric. We argue that NIC is more relevant than homophily for local aggregation methods as used by GNNs, and we show empirically that it correlates better with GNN accuracy in node classification tasks.
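To make the two notions above concrete, the following is a minimal sketch that is not taken from the paper. It assumes the common edge-homophily definition (the fraction of edges whose endpoints share a label; the abstract does not say which variant is used) and a plain mean over neighbors as a stand-in for one local aggregation step. The function names `edge_homophily` and `mean_aggregate` and the toy 4-cycle graph are hypothetical. NIC is not sketched because its definition is not given here.

```python
from collections import defaultdict


def edge_homophily(edges, labels):
    """Fraction of edges whose two endpoints carry the same label.

    Assumption: this is the common edge-homophily ratio, one of several
    homophily variants in the literature.
    """
    if not edges:
        raise ValueError("graph has no edges")
    same = sum(1 for u, v in edges if labels[u] == labels[v])
    return same / len(edges)


def mean_aggregate(edges, features):
    """One round of mean aggregation over undirected neighbors.

    Assumption: scalar node features and an unweighted mean; real GNN
    layers add learned weights and nonlinearities on top of this step.
    """
    neighbors = defaultdict(list)
    for u, v in edges:
        neighbors[u].append(v)
        neighbors[v].append(u)
    aggregated = []
    for node, own_value in enumerate(features):
        nbrs = neighbors[node]
        if nbrs:
            aggregated.append(sum(features[n] for n in nbrs) / len(nbrs))
        else:
            # Isolated node: keep its own feature.
            aggregated.append(own_value)
    return aggregated


# Toy 4-cycle: 0-1-2-3-0 (hypothetical example graph).
cycle_edges = [(0, 1), (1, 2), (2, 3), (3, 0)]

mixed = edge_homophily(cycle_edges, labels=[0, 0, 1, 1])
alternating = edge_homophily(cycle_edges, labels=[0, 1, 0, 1])
smoothed = mean_aggregate(cycle_edges, features=[1.0, 2.0, 3.0, 4.0])
```

With alternating labels every edge joins two different classes, so the graph is fully heterophilic under this measure, yet each node's neighborhood is still perfectly predictive of its label. This is the kind of case in which homophily can understate the information available to local aggregation.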