Advances in machine learning have led to graph neural network (GNN)-based methods for drug discovery, yielding promising results in molecular design, chemical synthesis planning, and molecular property prediction. However, the acceptance of current GNNs in drug discovery remains limited due to their lack of interpretability. Although this major weakness has been mitigated by the development of explainable artificial intelligence (XAI) techniques, the "ground truth" assignment in most explanation tasks ultimately rests on subjective human judgment, so the quality of model interpretations is difficult to evaluate quantitatively. In this work, we first build three levels of benchmark datasets to quantitatively assess the interpretability of state-of-the-art GNN models. We then implement recent XAI methods in combination with different GNN algorithms to highlight their benefits, limitations, and future opportunities for drug discovery. We find that GradInput and IG generally provide the best model interpretability for GNNs, especially when combined with GraphNet and CMPNN. The integrated XAI package is fully open-sourced and can be used by practitioners to train new models on other drug discovery tasks.
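For readers unfamiliar with the two attribution methods named above, the following is a minimal sketch of GradInput and integrated gradients (IG) applied to a GNN. It assumes a PyTorch model callable as `model(node_feats, edge_index)` that returns a scalar property prediction, and a zero baseline for IG; all names and the model signature are illustrative assumptions, not the paper's actual API.

```python
# Hypothetical sketch: GradInput and IG attributions over node features of a
# molecular graph. `model(node_feats, edge_index)` is an assumed interface.
import torch

def grad_input(model, node_feats, edge_index):
    """GradInput: elementwise product of the input and its gradient."""
    x = node_feats.detach().clone().requires_grad_(True)
    model(x, edge_index).sum().backward()
    return (x * x.grad).detach()  # per-node, per-feature attribution

def integrated_gradients(model, node_feats, edge_index, steps=50):
    """IG: average gradients along a straight path from a zero baseline,
    scaled by the input's deviation from that baseline."""
    baseline = torch.zeros_like(node_feats)
    total_grad = torch.zeros_like(node_feats)
    for alpha in torch.linspace(0.0, 1.0, steps):
        x = (baseline + alpha * (node_feats - baseline)).detach().requires_grad_(True)
        model(x, edge_index).sum().backward()
        total_grad += x.grad
    avg_grad = total_grad / steps
    return ((node_feats - baseline) * avg_grad).detach()
```

Both functions return a tensor shaped like `node_feats`; summing over the feature dimension gives a per-atom importance score that can be compared against a benchmark's ground-truth substructure labels.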