In recent years, benefiting from the expressive power of Graph Convolutional Networks (GCNs), significant breakthroughs have been made in face clustering. However, little attention has been paid to GCN-based clustering on imbalanced data. Although the imbalance problem has been extensively studied, the impact of imbalanced data on the GCN-based linkage prediction task is quite different, and it causes problems in two respects: imbalanced linkage labels and biased graph representations. The problem of imbalanced linkage labels is similar to that in image classification, whereas biased graph representations are a problem particular to GCN-based clustering via linkage prediction. Significantly biased graph representations in training can cause catastrophic overfitting of a GCN model. To tackle these problems, we evaluate, with extensive experiments, the feasibility of applying existing methods for imbalanced image classification to graphs, and we present a new method that alleviates the imbalanced labels and also augments graph representations using a Reverse-Imbalance Weighted Sampling (RIWS) strategy, followed by insightful analyses and discussions. A series of imbalanced benchmark datasets synthesized from MS-Celeb-1M and DeepFashion will be made openly available.
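The abstract does not specify how RIWS computes its sampling weights, so the following is only a generic sketch of the broader idea of reverse-imbalance weighting: items are drawn with probability inversely proportional to the size of their class, so that rare classes are seen about as often as frequent ones. The function names (`inverse_frequency_weights`, `weighted_resample`) and the toy data are hypothetical and are not taken from the paper.

```python
import random
from collections import Counter


def inverse_frequency_weights(labels):
    """Return one sampling weight per item, proportional to 1 / class size.

    Assumption: plain inverse class frequency; the paper's RIWS weighting
    may be defined differently.
    """
    counts = Counter(labels)
    return [1.0 / counts[y] for y in labels]


def weighted_resample(items, labels, k, seed=0):
    """Draw k items with replacement, favoring items from rare classes."""
    rng = random.Random(seed)
    weights = inverse_frequency_weights(labels)
    return rng.choices(items, weights=weights, k=k)


# Toy imbalanced data (hypothetical): identity "a" has 90 samples, "b" has 10.
labels = ["a"] * 90 + ["b"] * 10
items = list(range(len(labels)))

weights = inverse_frequency_weights(labels)
sample = weighted_resample(items, labels, k=1000)
sampled_labels = Counter(labels[i] for i in sample)
print(sampled_labels)
```

With these weights each class carries the same total sampling mass, so the resampled label counts come out roughly balanced even though the original data is split 90/10.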