We consider the problem of explaining the predictions of graph neural networks (GNNs), which are otherwise treated as black boxes. Existing methods invariably focus on the importance of individual nodes or edges but ignore the substructures of graphs, which are more intuitive and human-intelligible. In this work, we propose a novel method, SubgraphX, that explains GNNs by identifying important subgraphs. Given a trained GNN model and an input graph, SubgraphX explains the model's predictions by efficiently exploring different subgraphs with Monte Carlo tree search. To make the tree search more effective, we use Shapley values as a measure of subgraph importance, which also capture the interactions among different subgraphs. To expedite computation, we propose efficient approximation schemes for computing Shapley values on graph data. Our work represents the first attempt to explain GNNs by identifying subgraphs explicitly and directly. Experimental results show that SubgraphX achieves significantly improved explanations while keeping computation at a reasonable level.
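To make the Shapley-value idea concrete, the following is a minimal, self-contained sketch of the standard permutation-sampling Monte Carlo approximation of Shapley values. It is an illustration of the general technique, not the paper's exact approximation scheme: `value_fn` is a hypothetical stand-in for the trained GNN's prediction score on the subgraph induced by a coalition of nodes, and the toy games below are invented for demonstration.

```python
import random

def shapley_value(players, target, value_fn, n_samples=5000, seed=0):
    """Monte Carlo (permutation-sampling) estimate of the Shapley value
    of `target` among `players`.

    value_fn maps a frozenset of players to a score. In SubgraphX this
    role would be played by the GNN's prediction on the induced
    subgraph; here we use toy coalition games instead.
    """
    rng = random.Random(seed)
    others = [p for p in players if p != target]
    total = 0.0
    for _ in range(n_samples):
        rng.shuffle(others)                 # random order of the remaining players
        k = rng.randrange(len(others) + 1)  # uniform insertion slot for `target`
        coalition = frozenset(others[:k])   # players preceding `target`
        # Marginal contribution of `target` to this sampled coalition.
        total += value_fn(coalition | {target}) - value_fn(coalition)
    return total / n_samples

# Additive toy game: each player's Shapley value equals its own weight,
# and the marginal gain is constant, so the estimate is exact.
weights = {"a": 1.0, "b": 2.0, "c": 3.0}
additive = lambda s: sum(weights[p] for p in s)
print(shapley_value(list(weights), "b", additive))  # 2.0

# Interaction toy game: value 1 only when "a" AND "b" are both present.
# The estimator splits the credit between the interacting players
# (true Shapley value 0.5 each, 0 for "c").
pair = lambda s: 1.0 if {"a", "b"} <= s else 0.0
print(shapley_value(["a", "b", "c"], "a", pair))
```

Shuffling the remaining players and drawing a uniform insertion slot is equivalent to sampling a uniform permutation of all players, which is what makes the averaged marginal contributions an unbiased Shapley estimate; the second game shows why the abstract emphasizes capturing interactions, since purely marginal node scores would miss the joint effect of "a" and "b".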