Biomedical knowledge graphs permit an integrative computational approach to reasoning about biological systems. The nature of biological data leads to a graph structure that differs from those typically encountered in benchmarking datasets. To understand the implications this may have on the performance of reasoning algorithms, we conduct an empirical study based on the real-world task of drug repurposing. We formulate this task as a link prediction problem where both compounds and diseases correspond to entities in a knowledge graph. To overcome apparent weaknesses of existing algorithms, we propose a new method, PoLo, that combines policy-guided walks based on reinforcement learning with logical rules. These rules are integrated into the algorithm by using a novel reward function. We apply our method to Hetionet, which integrates biomedical information from 29 prominent bioinformatics databases. Our experiments show that our approach outperforms several state-of-the-art methods for link prediction while providing interpretability.
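The link prediction formulation and the rule-aware reward described above can be illustrated with a minimal sketch. It is not the PoLo implementation: the reinforcement-learning policy is replaced by exhaustive enumeration of short walks, and the toy triples, the rules, their confidence values, the weight `lam`, and the additive form of the reward are all assumptions made for illustration. The abstract does not specify any of them.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]      # (head entity, relation, tail entity)
Step = Tuple[str, str]             # (relation followed, entity reached)

# Toy triples invented for illustration; not taken from Hetionet.
TRIPLES: List[Triple] = [
    ("compound_A", "binds", "gene_X"),
    ("gene_X", "associates", "disease_D"),
    ("compound_A", "resembles", "compound_B"),
    ("compound_B", "treats", "disease_D"),
]

# Hypothetical logical rules: a relation sequence (rule body) mapped to an
# assumed confidence score.
RULES: Dict[Tuple[str, ...], float] = {
    ("binds", "associates"): 0.6,
    ("resembles", "treats"): 0.8,
}


def build_adjacency(triples: List[Triple]) -> Dict[str, List[Step]]:
    """Index outgoing edges by head entity."""
    adjacency: Dict[str, List[Step]] = {}
    for head, relation, tail in triples:
        adjacency.setdefault(head, []).append((relation, tail))
    return adjacency


def enumerate_walks(adjacency: Dict[str, List[Step]], start: str,
                    max_len: int) -> List[List[Step]]:
    """Return every walk of 1..max_len edges starting at `start`.

    A learned policy would sample walks instead; enumeration keeps the
    sketch deterministic.
    """
    walks: List[List[Step]] = []
    stack: List[Tuple[str, List[Step]]] = [(start, [])]
    while stack:
        node, path = stack.pop()
        if path:
            walks.append(path)
        if len(path) < max_len:
            for relation, nxt in adjacency.get(node, []):
                stack.append((nxt, path + [(relation, nxt)]))
    return walks


def reward(path: List[Step], target: str,
           rules: Dict[Tuple[str, ...], float], lam: float = 1.0) -> float:
    """Assumed reward: 1 for reaching the target, plus a weighted bonus
    when the walk's relation sequence matches a rule body."""
    reached = 1.0 if path and path[-1][1] == target else 0.0
    relations = tuple(relation for relation, _ in path)
    return reached + lam * rules.get(relations, 0.0)


# Link prediction query: which walk best supports (compound_A, treats, disease_D)?
adjacency = build_adjacency(TRIPLES)
walks = enumerate_walks(adjacency, "compound_A", max_len=2)
best_walk = max(walks, key=lambda p: reward(p, "disease_D", RULES))
```

Because the highest-scoring walk is an explicit sequence of relations and entities, it can be read as an explanation for the predicted link, which is the sense of interpretability the abstract refers to.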