Graph Neural Networks (GNNs) have achieved promising results on semi-supervised learning tasks on graphs, such as node classification. Despite this success, many real-world graphs are sparsely and noisily labeled, which can significantly degrade the performance of GNNs because the noisy information propagates to unlabeled nodes through the graph structure. It is therefore important to develop a label-noise-resistant GNN for semi-supervised node classification. Although learning neural networks with noisy labels has been studied extensively, existing methods mostly focus on independent and identically distributed data and assume that a large number of noisy labels are available, so they are not directly applicable to GNNs. We therefore investigate a novel problem: learning a robust GNN with noisy and limited labels. To alleviate the negative effects of label noise, we propose to link unlabeled nodes with labeled nodes of high feature similarity, which brings in more clean label information. Furthermore, this strategy yields accurate pseudo labels, which provide additional supervision and further reduce the effects of label noise. Our theoretical and empirical analyses verify the effectiveness of these two strategies under mild conditions. Extensive experiments on real-world datasets demonstrate the effectiveness of the proposed method in learning a robust GNN with noisy and limited labels.
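The linking idea described above can be illustrated with a small sketch. This is not the authors' implementation: it assumes cosine similarity on raw node features and a fixed threshold, and the function name `link_unlabeled_to_labeled` and the `threshold` parameter are hypothetical names chosen for illustration only.

```python
import numpy as np

def link_unlabeled_to_labeled(features, adj, labeled_idx, threshold=0.9):
    """Return a copy of `adj` with extra undirected edges between each
    unlabeled node and every labeled node whose cosine feature similarity
    is at least `threshold` (an assumed, hand-picked cutoff)."""
    n = features.shape[0]
    labeled_idx = np.asarray(labeled_idx)
    unlabeled_idx = np.setdiff1d(np.arange(n), labeled_idx)

    # Normalize rows so that a dot product equals cosine similarity.
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.clip(norms, 1e-12, None)

    # Similarity matrix of shape (num_unlabeled, num_labeled).
    sim = unit[unlabeled_idx] @ unit[labeled_idx].T
    rows, cols = np.nonzero(sim >= threshold)

    new_adj = adj.copy()
    u, v = unlabeled_idx[rows], labeled_idx[cols]
    new_adj[u, v] = 1  # add the edge in both directions
    new_adj[v, u] = 1
    return new_adj

# Toy graph: nodes 0 and 2 are labeled, nodes 1 and 3 are unlabeled.
features = np.array([[1.0, 0.0], [0.99, 0.1], [0.0, 1.0], [0.1, 0.99]])
adj = np.zeros((4, 4), dtype=int)
augmented = link_unlabeled_to_labeled(features, adj, labeled_idx=[0, 2])
```

In this toy example node 1 is connected to labeled node 0 and node 3 to labeled node 2, so each unlabeled node gains a direct path to a labeled node with similar features. A fixed threshold is the simplest possible choice here; a learned similarity would be a natural alternative.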