Prevailing graph neural network models have made significant progress in graph representation learning. In this paper, however, we uncover a long-overlooked phenomenon: a pre-trained graph representation learning model tested on full graphs underperforms the same model tested on well-pruned graphs. This observation reveals that graphs contain confounders that may interfere with the model's learning of semantic information, and that current graph representation learning methods do not eliminate their influence. To tackle this issue, we propose Robust Causal Graph Representation Learning (RCGRL), which learns graph representations that are robust against confounding effects. RCGRL introduces an active approach to generating instrumental variables under unconditional moment restrictions, which empowers the graph representation learning model to eliminate confounders and thereby capture discriminative information that is causally related to downstream predictions. We offer theorems and proofs to guarantee the theoretical effectiveness of the proposed approach. Empirically, we conduct extensive experiments on a synthetic dataset and multiple benchmark datasets. The results demonstrate that, compared with state-of-the-art methods, RCGRL achieves better prediction performance and generalization ability.