The class imbalance problem, an important issue in learning node representations, has drawn increasing attention from the community. Although the imbalance considered by existing studies stems from the unequal quantity of labeled examples in different classes (quantity imbalance), we argue that graph data expose a unique source of imbalance arising from the asymmetric topological properties of the labeled nodes, i.e., labeled nodes are not equal in terms of their structural role in the graph (topology imbalance). In this work, we first probe the previously unknown topology-imbalance issue, including its characteristics, causes, and threats to semi-supervised node classification. We then provide a unified view for jointly analyzing the quantity- and topology-imbalance issues by considering the node influence shift phenomenon with the Label Propagation algorithm. In light of our analysis, we devise Totoro, a metric based on influence conflict detection that measures the degree of graph topology imbalance, and propose ReNode, a model-agnostic method that addresses the topology-imbalance issue by adaptively re-weighting the influence of labeled nodes based on their relative positions to class boundaries. Systematic experiments demonstrate the effectiveness and generalizability of our method in relieving the topology-imbalance issue and improving semi-supervised node classification. Further analysis reveals that different graph neural networks (GNNs) vary in their sensitivity to topology imbalance, which may serve as a new perspective for evaluating GNN architectures.
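The abstract gives no formulas, so the following is only a minimal sketch of how influence conflict and position-aware re-weighting could be instantiated. It assumes that label-propagation influence is modeled with a personalized PageRank matrix, that a labeled node's conflict is the overlap between its own influence and the average influence of labeled nodes from other classes, and that training weights follow a cosine schedule over the conflict ranking. The function names, the toy path graph, and the values of `alpha`, `w_min`, and `w_max` are all hypothetical choices made for illustration.

```python
import numpy as np

def ppr_matrix(adj, alpha=0.15):
    """Personalized PageRank matrix; row v is the influence spread of node v.
    Assumes an undirected graph with no isolated nodes."""
    deg = adj.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    a_norm = d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]
    n = adj.shape[0]
    return alpha * np.linalg.inv(np.eye(n) - (1.0 - alpha) * a_norm)

def conflict_scores(ppr, labeled, labels):
    """Assumed conflict measure: overlap of a labeled node's influence with
    the mean influence of labeled nodes belonging to other classes."""
    classes = np.unique(labels)
    class_influence = {c: ppr[labeled[labels == c]].mean(axis=0) for c in classes}
    scores = np.empty(len(labeled))
    for k, (v, c) in enumerate(zip(labeled, labels)):
        other = sum(class_influence[j] for j in classes if j != c)
        scores[k] = ppr[v] @ other
    return scores

def rank_based_weights(scores, w_min=0.5, w_max=1.5):
    """Assumed schedule: cosine decay from w_max (lowest conflict)
    to w_min (highest conflict) over the conflict ranking."""
    order = np.argsort(scores)
    ranks = np.empty(len(scores), dtype=int)
    ranks[order] = np.arange(len(scores))
    denom = max(len(scores) - 1, 1)
    return w_min + 0.5 * (w_max - w_min) * (1.0 + np.cos(ranks * np.pi / denom))

# Toy path graph 0-1-2-3-4-5 (hypothetical data).
adj = np.zeros((6, 6))
for u in range(5):
    adj[u, u + 1] = adj[u + 1, u] = 1.0

labeled = np.array([0, 2, 3, 5])   # labeled node indices
labels = np.array([0, 0, 1, 1])    # nodes 2 and 3 sit at the class boundary

ppr = ppr_matrix(adj)
scores = conflict_scores(ppr, labeled, labels)
weights = rank_based_weights(scores)
print(dict(zip(labeled.tolist(), weights.round(3).tolist())))
```

In this toy setting the labeled nodes adjacent to the class boundary (2 and 3) receive higher conflict scores, and therefore smaller weights, than the nodes at the ends of the path. In training, such weights would multiply each labeled node's loss term, which is why a scheme of this kind does not depend on the GNN architecture.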