Following a rapid initial breakthrough in graph-based learning, Graph Neural Networks (GNNs) have achieved widespread application across many science and engineering fields, prompting the need for methods to understand their decision process. GNN explainers have started to emerge in recent years, with a multitude of methods that are either novel or adapted from other domains. To sort out this plethora of alternative approaches, several studies have benchmarked the performance of different explainers in terms of various explainability metrics. However, these earlier works make no attempt to provide insights into why different GNN architectures are more or less explainable, or into which explainer should be preferred in a given setting. In this survey, we fill these gaps by devising a systematic experimental study, which tests ten explainers on eight representative architectures trained on six carefully designed graph and node classification datasets. With our results, we provide key insights into the choice and applicability of GNN explainers, isolate the key components that make them usable and successful, and provide recommendations on how to avoid common interpretation pitfalls. We conclude by highlighting open questions and possible directions for future research.