Graph Neural Networks (GNNs) have achieved remarkable success in graph data representation. However, previous work has mostly assumed ideal, balanced datasets and has rarely considered the imbalanced datasets that arise in practice, even though these matter more for real applications of GNNs. Traditional techniques for handling imbalanced datasets, such as resampling, reweighting, and synthetic sample generation, are not directly applicable to GNNs. Ensemble models handle imbalanced datasets better than a single estimator, and ensemble learning also achieves higher estimation accuracy and better reliability. In this paper, we propose an ensemble model called AdaGCN, which uses a Graph Convolutional Network (GCN) as the base estimator during adaptive boosting. In AdaGCN, training samples that were misclassified by the previous classifier are assigned a higher weight, and transfer learning is used to reduce computational cost and increase fitting capability. Experiments show that the proposed AdaGCN model outperforms GCN, GraphSAGE, GAT, N-GCN, and most advanced reweighting and resampling methods on synthetic imbalanced datasets, with an average improvement of 4.3%. Our model also improves on state-of-the-art baselines on all of the challenging node classification tasks we consider: Cora, Citeseer, Pubmed, and NELL.
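As a rough illustration of the sample reweighting idea behind adaptive boosting, the following minimal sketch applies one round of the standard multi-class AdaBoost (SAMME) weight update to a toy set of node labels. This is an assumption made for illustration: the abstract only states that misclassified samples receive a higher weight, so the exact update rule used by AdaGCN may differ. The function name `samme_weight_update` and the toy labels are hypothetical, and the GCN base estimator is replaced here by a fixed vector of predictions.

```python
import numpy as np

def samme_weight_update(weights, y_true, y_pred, n_classes):
    """One round of the SAMME (multi-class AdaBoost) sample-weight update.

    Hypothetical helper for illustration; not taken from the AdaGCN paper.
    Returns the normalized new weights and the classifier weight alpha.
    """
    miss = (y_true != y_pred)                      # True where the base classifier was wrong
    err = np.sum(weights * miss) / np.sum(weights) # weighted error rate
    err = np.clip(err, 1e-10, 1 - 1e-10)           # avoid log(0) and division by zero
    alpha = np.log((1 - err) / err) + np.log(n_classes - 1)
    new_weights = weights * np.exp(alpha * miss)   # up-weight only the misclassified samples
    return new_weights / new_weights.sum(), alpha

# Toy imbalanced example: class 0 dominates, the single class-1 node is misclassified.
y_true = np.array([0, 0, 0, 0, 1, 2])
y_pred = np.array([0, 0, 0, 0, 0, 2])              # stand-in for a base classifier's output
w0 = np.full(len(y_true), 1.0 / len(y_true))       # uniform initial weights

w1, alpha = samme_weight_update(w0, y_true, y_pred, n_classes=3)
print(w1, alpha)
```

In this toy run the misclassified minority-class node ends up with most of the weight mass, so the next base classifier in the ensemble would be pushed to fit it. In a GCN setting, such weights would typically enter the training loss as per-node multipliers.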