Learning powerful representations is a central theme of graph neural networks (GNNs). It requires distilling the critical information from the input graph, rather than trivial patterns, to enrich the representations. Toward this end, graph attention and pooling methods prevail. They mostly follow the paradigm of "learning to attend", which maximizes the mutual information between the attended subgraph and the ground-truth label. However, this training paradigm is prone to capturing spurious correlations between the trivial subgraph and the label. Such spurious correlations benefit in-distribution (ID) test evaluation but cause poor generalization on out-of-distribution (OOD) test data. In this work, we revisit GNN modeling from a causal perspective. Under our causal assumption, the trivial information acts as a confounder between the critical information and the label: it opens a backdoor path between them and makes them spuriously correlated. Hence, we present a new paradigm of deconfounded training (DTP) that better mitigates the confounding effect and latches on the critical information, so as to enhance representation and generalization ability. Specifically, we adopt attention modules to disentangle the critical subgraph from the trivial subgraph. We then make each critical subgraph interact fairly with diverse trivial subgraphs to achieve stable predictions. This allows GNNs to capture a more reliable subgraph whose relation to the label is robust across distributions. Extensive experiments on synthetic and real-world datasets demonstrate the effectiveness of our approach.
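The two steps described above (attention-based disentanglement, then fair pairing of critical and trivial parts) can be illustrated with a small, framework-free sketch. This is a toy NumPy illustration under assumed shapes, not the paper's implementation; the function names `attend_split` and `deconfounded_logits`, the sigmoid attention, and the linear classifiers are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def attend_split(node_feats, att_logits):
    """Soft attention disentangles a graph into critical / trivial readouts.

    node_feats: (num_nodes, d) node features; att_logits: (num_nodes,) scores.
    Returns mean-pooled critical and trivial graph representations.
    (Illustrative assumption: sigmoid attention with mean pooling.)
    """
    a = 1.0 / (1.0 + np.exp(-att_logits))                    # attention scores in (0, 1)
    h_crit = (a[:, None] * node_feats).mean(axis=0)          # attended (critical) readout
    h_triv = ((1.0 - a)[:, None] * node_feats).mean(axis=0)  # residual (trivial) readout
    return h_crit, h_triv

def deconfounded_logits(crit_batch, triv_batch, W_c, W_t):
    """Backdoor-style intervention: pair every critical representation with
    every trivial representation in the batch and average the predictions,
    so no single trivial pattern can dominate the decision.
    crit_batch / triv_batch: lists of (d,) vectors; W_c, W_t: (d, C) classifiers.
    """
    n = len(crit_batch)
    out = np.zeros((n, W_c.shape[1]))
    for i, hc in enumerate(crit_batch):
        for ht in triv_batch:                # stratify over diverse trivial parts
            out[i] += hc @ W_c + ht @ W_t
    return out / n                           # average over the interventions
```

Because the trivial contribution is averaged over the whole batch, it becomes a shared offset across samples, so only the critical representation can carry sample-specific signal into the prediction.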