A rationale is defined as a subset of input features that best explains or supports the prediction of a machine learning model. Rationale identification has improved the generalizability and interpretability of neural networks on vision and language data. In graph applications such as molecule and polymer property prediction, identifying representative subgraph structures, called graph rationales, plays an essential role in the performance of graph neural networks. Existing graph pooling and/or distribution intervention methods suffer from a lack of examples from which to learn to identify optimal graph rationales. In this work, we introduce a new augmentation operation called environment replacement that automatically creates virtual data examples to improve rationale identification. We propose an efficient framework that performs rationale-environment separation and representation learning on the real and augmented examples in latent spaces, avoiding the high complexity of explicit graph decoding and encoding. Experiments on seven molecular and four polymer real-world datasets demonstrate the effectiveness and efficiency of the proposed augmentation-based graph rationalization framework compared with recent techniques.
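As a rough illustration of environment replacement in latent space, the following minimal sketch separates each graph's node embeddings into a rationale vector and an environment vector, then pairs every rationale with every environment in the batch. It is a sketch under stated assumptions, not the framework's actual implementation: the soft node mask, the sum pooling, the additive combination, and the names `separate` and `environment_replacement` are all hypothetical choices made for this example, and the random embeddings stand in for the output of a graph neural network encoder.

```python
import numpy as np


def separate(node_emb, mask):
    """Split one graph's node embeddings into rationale and environment vectors.

    node_emb: (num_nodes, dim) array of node embeddings (assumed GNN output).
    mask:     (num_nodes,) soft mask in [0, 1]; 1 means "rationale node".
    Sum pooling is an assumption made for this sketch.
    """
    mask = mask[:, None]
    rationale = (mask * node_emb).sum(axis=0)
    environment = ((1.0 - mask) * node_emb).sum(axis=0)
    return rationale, environment


def environment_replacement(rationales, environments):
    """Combine rationale i with environment j for every pair in the batch.

    rationales, environments: (batch, dim) arrays.
    Returns a (batch, batch, dim) array; entry [i, j] is a virtual example
    built from graph i's rationale and graph j's environment.
    Additive combination is an assumption made for this sketch.
    """
    return rationales[:, None, :] + environments[None, :, :]


# Toy batch: three graphs with different node counts and 4-dim embeddings.
rng = np.random.default_rng(0)
graphs = [rng.normal(size=(n, 4)) for n in (5, 7, 3)]
masks = [rng.uniform(size=g.shape[0]) for g in graphs]

pairs = [separate(g, m) for g, m in zip(graphs, masks)]
R = np.stack([r for r, _ in pairs])  # (3, 4) rationale vectors
E = np.stack([e for _, e in pairs])  # (3, 4) environment vectors

augmented = environment_replacement(R, E)  # (3, 3, 4)
```

The diagonal entries `augmented[i, i]` recover the real examples, while the off-diagonal entries are the virtual ones. Because everything operates on pooled vectors, no augmented graph ever has to be decoded or re-encoded explicitly.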