Graph Neural Networks (GNNs) have shown great ability in modeling graph-structured data across various domains. However, GNNs are known as black-box models that lack interpretability. Without understanding their inner workings, we cannot fully trust them, which largely limits their adoption in high-stakes scenarios. Though some initial efforts have been made to interpret the predictions of GNNs, they mainly focus on providing post-hoc explanations using an additional explainer, which could misrepresent the true inner workings of the target GNN. Work on self-explainable GNNs remains rather limited. Therefore, we study a novel problem of learning prototype-based self-explainable GNNs that can simultaneously give accurate predictions and prototype-based explanations for those predictions. We design a framework that learns prototype graphs capturing representative patterns of each class as class-level explanations. The learned prototypes are also used to simultaneously make predictions for a test instance and provide an instance-level explanation. Extensive experiments on real-world and synthetic datasets show the effectiveness of the proposed framework for both prediction accuracy and explanation quality.
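To make the prototype mechanism concrete, below is a minimal PyTorch sketch of a prototype-based classification head, not the paper's actual architecture: the `PrototypeReadout` class, the number of prototypes per class, and the cosine-similarity scoring are all illustrative assumptions, and in the described framework the prototypes are graphs rather than plain embedding vectors. The sketch shows how similarity to learned per-class prototypes can yield a prediction and, via the most similar prototype, an instance-level explanation at the same time.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class PrototypeReadout(nn.Module):
    """Hypothetical prototype-based classification head (illustrative only).

    Holds K learnable prototype embeddings per class; a graph is classified
    by its similarity to the prototypes, and the most similar prototype of
    the predicted class doubles as the instance-level explanation.
    """

    def __init__(self, embed_dim: int, num_classes: int, protos_per_class: int = 3):
        super().__init__()
        self.num_classes = num_classes
        self.protos_per_class = protos_per_class
        # (num_classes * protos_per_class, embed_dim), ordered class-major
        self.prototypes = nn.Parameter(
            torch.randn(num_classes * protos_per_class, embed_dim))

    def forward(self, graph_emb: torch.Tensor):
        # graph_emb: (batch, embed_dim) pooled graph embeddings from any GNN encoder
        # Cosine similarity between each graph and every prototype: (batch, C*K)
        sim = F.cosine_similarity(
            graph_emb.unsqueeze(1), self.prototypes.unsqueeze(0), dim=-1)
        # Class score = max similarity over that class's prototypes
        sim = sim.view(-1, self.num_classes, self.protos_per_class)
        logits, best_proto = sim.max(dim=-1)
        return logits, best_proto  # best_proto indexes the explaining prototype


# Usage: embeddings would come from any GNN encoder followed by graph pooling.
head = PrototypeReadout(embed_dim=64, num_classes=2)
emb = torch.randn(8, 64)                 # stand-in for pooled GNN outputs
logits, best_proto = head(emb)
pred = logits.argmax(dim=-1)             # class prediction per graph
expl = best_proto.gather(1, pred.unsqueeze(1))  # prototype explaining each prediction
```

Scoring each class by its nearest prototype (rather than an opaque linear layer) is what makes the head self-explainable: the same similarity that drives the prediction identifies which learned representative pattern the test instance matched.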