Node classification is a fundamental graph-based task that aims to predict the classes of unlabeled nodes, for which Graph Neural Networks (GNNs) are the state-of-the-art methods. Current GNNs assume that all nodes in the training set contribute equally during training. However, the quality of training nodes varies greatly, and the performance of GNNs can be harmed by two types of low-quality training nodes: (1) inter-class nodes, which are situated near class boundaries and lack the typical characteristics of their classes; because GNNs are data-driven, training on these nodes can degrade accuracy; and (2) mislabeled nodes, which are common in real-world graphs and can significantly degrade the robustness of GNNs. To mitigate the detrimental effect of these low-quality training nodes, we present CLNode, which employs a selective training strategy to train a GNN based on the quality of nodes. Specifically, we first design a multi-perspective difficulty measurer to accurately measure the quality of training nodes. Then, based on the measured quality, we employ a training scheduler that selects appropriate training nodes to train the GNN in each epoch. To evaluate the effectiveness of CLNode, we conduct extensive experiments by incorporating it into six representative backbone GNNs. Experimental results on real-world networks demonstrate that CLNode is a general framework that can be combined with various GNNs to improve their accuracy and robustness.
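The two-stage idea described above (score the difficulty of each training node, then let a scheduler pick which nodes to train on in each epoch) can be illustrated with a minimal sketch. It is not CLNode's actual implementation: the abstract does not specify the difficulty metrics or the schedule, so the neighborhood label entropy used here is a single, assumed perspective, the linear pacing function is likewise an assumption, and the names `neighborhood_difficulty`, `pacing`, `select_training_nodes`, and `initial_fraction` are hypothetical.

```python
import math
from collections import Counter


def neighborhood_difficulty(node, neighbors, labels):
    """Entropy of the labels seen on a node and its labeled neighbors.

    Assumed difficulty score: 0 when the whole neighborhood agrees on one
    class, larger when classes are mixed (e.g. near a class boundary).
    """
    seen = [labels[n] for n in [node, *neighbors.get(node, [])] if n in labels]
    counts = Counter(seen)
    total = len(seen)
    return -sum((c / total) * math.log(c / total) for c in counts.values())


def pacing(epoch, total_epochs, initial_fraction=0.25):
    """Assumed linear pacing: fraction of training nodes usable at `epoch`."""
    progress = min(1.0, epoch / total_epochs)
    return min(1.0, initial_fraction + (1.0 - initial_fraction) * progress)


def select_training_nodes(train_nodes, difficulty, epoch, total_epochs):
    """Return the easiest nodes permitted by the pacing function."""
    ranked = sorted(train_nodes, key=lambda n: difficulty[n])
    keep = max(1, math.ceil(pacing(epoch, total_epochs) * len(ranked)))
    return ranked[:keep]


# Toy path graph 0 - 1 - 2 - 3 with a class boundary between nodes 1 and 2.
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
labels = {0: "a", 1: "a", 2: "b", 3: "b"}
difficulty = {n: neighborhood_difficulty(n, neighbors, labels) for n in labels}

for epoch in (0, 2, 4):
    subset = select_training_nodes(list(labels), difficulty, epoch, total_epochs=4)
    print(epoch, subset)  # the subset grows from easy nodes to all nodes
```

In a real training loop, the selected subset would define the nodes whose loss is backpropagated in that epoch. The boundary nodes 1 and 2 enter the training set later than the interior nodes 0 and 3, which is the behavior the selective strategy is meant to produce.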