Explainability of graph neural networks (GNNs) aims to answer the question ``Why did the GNN make a certain prediction?'', which is crucial for interpreting model predictions. The feature attribution framework distributes a GNN's prediction to its input features (e.g., edges), identifying an influential subgraph as the explanation. A standard way to evaluate an explanation (i.e., subgraph importance) is to audit the model prediction based solely on the subgraph. However, we argue that a distribution shift exists between the full graph and the subgraph, causing an out-of-distribution (OOD) problem. Furthermore, through an in-depth causal analysis, we find that the OOD effect acts as a confounder, which introduces spurious associations between subgraph importance and the model prediction, making the evaluation less reliable. In this work, we propose Deconfounded Subgraph Evaluation (DSE), which assesses the causal effect of an explanatory subgraph on the model prediction. Because the distribution shift is generally intractable, we employ the front-door adjustment and introduce a surrogate variable for the subgraphs. Specifically, we devise a generative model to produce plausible surrogates that conform to the data distribution, thereby approaching an unbiased estimation of subgraph importance. Empirical results demonstrate the effectiveness of DSE in terms of explanation fidelity.
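To make the front-door adjustment mentioned above concrete, the sketch below states the standard front-door formula with notation chosen here for illustration (the paper's own symbols may differ): $G_s$ denotes the explanatory subgraph (treatment), $G'$ the generated surrogate graph (mediator), $Y$ the model prediction (outcome), and the OOD effect plays the role of the unobserved confounder.
\[
P\bigl(Y \mid \mathrm{do}(G_s = g_s)\bigr)
  = \sum_{g'} P\bigl(G' = g' \mid G_s = g_s\bigr)
    \sum_{g''} P\bigl(Y \mid G_s = g'', G' = g'\bigr)\, P\bigl(G_s = g''\bigr).
\]
Intuitively, the inner sum averages out the confounded dependence between the subgraph and the prediction, while the outer sum marginalizes over surrogates drawn from the generative model.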