Thanks to the increasing availability of drug-drug interaction (DDI) datasets and large biomedical knowledge graphs (KGs), accurate detection of adverse DDIs using machine learning models has become possible. However, how to effectively utilize large and noisy biomedical KGs for DDI detection remains largely an open problem. Because of the sheer size of KGs and the amount of noise they contain, directly integrating them with smaller but higher-quality data (e.g., experimental data) is often of limited benefit. Most existing approaches ignore KGs altogether. Some try to integrate KGs directly with other data via graph neural networks, with limited success. Furthermore, most previous work focuses on binary DDI prediction, whereas predicting the multi-typed pharmacological effects of DDIs is a more meaningful but harder task. To fill these gaps, we propose a new method, SumGNN (knowledge summarization graph neural network), which is enabled by three components: a subgraph extraction module that efficiently anchors on relevant subgraphs of a KG, a self-attention-based subgraph summarization scheme that generates a reasoning path within the subgraph, and a multi-channel knowledge and data integration module that utilizes massive external biomedical knowledge for significantly improved multi-typed DDI prediction. SumGNN outperforms the best baseline by up to 5.54%, and the performance gain is particularly significant for low-data relation types. In addition, SumGNN provides interpretable predictions via the reasoning path generated for each prediction.
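To make the idea of anchoring on a relevant subgraph concrete, the following is a minimal sketch and not the SumGNN implementation. The abstract does not specify how subgraphs are extracted, so the sketch assumes one common choice: keep the KG entities that lie within k hops of both drugs in a pair, then keep the triples among them. The function names (`k_hop_neighbors`, `extract_pair_subgraph`), the parameter `k`, and the toy entities are all hypothetical and chosen only for illustration.

```python
from collections import deque


def k_hop_neighbors(adj, source, k):
    """Return every node within k hops of `source`, including `source`."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if dist[node] == k:
            continue  # do not expand beyond k hops
        for nbr in adj.get(node, ()):
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return set(dist)


def extract_pair_subgraph(triples, drug_u, drug_v, k=2):
    """Keep triples whose endpoints are within k hops of BOTH drugs.

    Assumption: edges are treated as undirected when measuring hops.
    """
    adj = {}
    for head, _rel, tail in triples:
        adj.setdefault(head, set()).add(tail)
        adj.setdefault(tail, set()).add(head)
    keep = k_hop_neighbors(adj, drug_u, k) & k_hop_neighbors(adj, drug_v, k)
    keep |= {drug_u, drug_v}  # the drug pair itself always stays
    return [(h, r, t) for h, r, t in triples if h in keep and t in keep]


# Toy KG with made-up entities, for illustration only.
toy_triples = [
    ("drugA", "targets", "gene1"),
    ("drugB", "targets", "gene1"),
    ("gene1", "participates_in", "pathway1"),
    ("drugA", "treats", "disease1"),
    ("drugC", "targets", "gene2"),
]
subgraph = extract_pair_subgraph(toy_triples, "drugA", "drugB", k=1)
```

With `k=1`, only the two triples linking `drugA` and `drugB` to their shared target `gene1` survive; entities unrelated to the pair, such as `drugC`, are dropped. A small local subgraph of this kind is what a later summarization step would operate on, rather than the full noisy KG.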