Accurate identification of disease genes has long been one of the keys to decoding a disease's molecular mechanism. Most current approaches construct biological networks and apply machine learning, especially deep learning, to identify disease genes, but they ignore the complex relations between entities in biological knowledge graphs. In this paper, we construct a biological knowledge graph centered on diseases and genes, and develop KDGene, an end-to-end Knowledge graph completion model for Disease Gene prediction based on interactional tensor decomposition. KDGene adds an interaction module between the embeddings of entities and relations to tensor decomposition, which effectively enhances information interaction in biological knowledge. Experimental results show that KDGene significantly outperforms state-of-the-art algorithms. Furthermore, a comprehensive biological analysis of the diabetes mellitus case confirms KDGene's ability to identify new and accurate candidate genes. This work proposes a scalable knowledge graph completion framework for identifying disease candidate genes, and its results promise to provide valuable references for further wet-lab experiments.
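To make the idea of scoring disease-gene triples by tensor decomposition with an interaction step more concrete, the following is a minimal sketch and not KDGene's actual architecture. It assumes a CP-style trilinear score and a hypothetical sigmoid gate, computed from the relation embedding, that modulates the disease embedding before scoring. The names `interact`, `score_all_tails`, `rank_candidates`, the weight matrix `W`, and the random embeddings are all illustrative assumptions; the abstract does not specify the form of the interaction module.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def interact(head, rel, W):
    """Hypothetical interaction module (an assumption, not the paper's design):
    a relation-dependent gate that rescales each dimension of the head embedding."""
    gate = sigmoid(W @ rel)
    return gate * head


def score_all_tails(head, rel, tails, W):
    """CP-style trilinear score for every candidate tail entity:
    score_j = sum_k h'_k * r_k * t_jk, where h' is the gated head embedding."""
    h_mod = interact(head, rel, W)
    return tails @ (h_mod * rel)


def rank_candidates(head, rel, tails, W, top_k=3):
    """Return the indices of the top_k highest-scoring tails and all scores."""
    scores = score_all_tails(head, rel, tails, W)
    order = np.argsort(-scores)
    return order[:top_k], scores


# Toy data: random embeddings stand in for trained ones.
rng = np.random.default_rng(0)
dim, n_genes = 8, 5
disease = rng.normal(size=dim)           # head entity (a disease)
relation = rng.normal(size=dim)          # relation, e.g. "disease-gene association"
genes = rng.normal(size=(n_genes, dim))  # candidate tail entities (genes)
W = rng.normal(size=(dim, dim))          # gate weights (hypothetical parameter)

top, scores = rank_candidates(disease, relation, genes, W)
print("top candidate gene indices:", top)
```

In a trained model the embeddings and `W` would be learned from the knowledge graph's triples, and ranking all genes for a given disease and relation yields the candidate list. Here the values are random, so the ranking itself carries no biological meaning.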