Graph Based Link Prediction between Human Phenotypes and Genes

Background: Deep phenotyping can be defined as learning genotype-phenotype associations and the history of human disease through detailed and precise analysis of phenotypic abnormalities. Understanding and detecting the interaction between phenotype and genotype is a fundamental step in translating precision medicine into clinical practice. Recent advances in machine learning make it possible to predict these interactions between abnormal human phenotypes and genes efficiently.

Methods: In this study, we developed a framework to predict links between Human Phenotype Ontology (HPO) terms and genes. Annotation data from heterogeneous knowledge resources, i.e., Orphanet, is used to parse human phenotype-gene associations. To generate embeddings for the nodes (HPO terms and genes), an algorithm called node2vec was used. It samples nodes from this graph using random walks, then learns features over the sampled nodes to generate embeddings. These embeddings were used in the downstream task of predicting the presence of a link between two nodes, using 5 different supervised machine learning algorithms.

Results: In the downstream link prediction task, the gradient boosting decision tree based model (LightGBM) achieved the best AUROC (0.904) and AUCPR (0.784). In addition, LightGBM achieved the best weighted F1 score, 0.87. Compared with the other 4 methods, LightGBM finds more accurate interactions/links between human phenotype and gene pairs.
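The random-walk sampling step described in the Methods can be illustrated with a minimal sketch. It assumes an undirected, unweighted graph and uses the usual node2vec return parameter p and in-out parameter q. The node names (hpo_1, gene_a, ...) are hypothetical placeholders rather than real annotations, and the helper names build_adjacency and node2vec_walk are introduced here only for illustration. Learning the embeddings themselves (node2vec feeds such walks to a skip-gram model) and the downstream classifiers are not shown.

```python
import random


def build_adjacency(edges):
    """Build an undirected adjacency map from (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    # Sorted neighbour lists keep walks reproducible for a fixed seed.
    return {node: sorted(nbrs) for node, nbrs in adj.items()}


def node2vec_walk(adj, start, length, p=1.0, q=1.0, rng=random):
    """Sample one second-order biased random walk in the style of node2vec."""
    walk = [start]
    while len(walk) < length:
        cur = walk[-1]
        nbrs = adj[cur]
        if not nbrs:
            break  # dead end: stop the walk early
        if len(walk) == 1:
            # The first step has no previous node, so it is uniform.
            walk.append(rng.choice(nbrs))
            continue
        prev = walk[-2]
        prev_nbrs = set(adj[prev])
        weights = []
        for nxt in nbrs:
            if nxt == prev:
                weights.append(1.0 / p)  # step back to the previous node
            elif nxt in prev_nbrs:
                weights.append(1.0)      # stay one hop away from prev
            else:
                weights.append(1.0 / q)  # move further away from prev
        walk.append(rng.choices(nbrs, weights=weights, k=1)[0])
    return walk


# Toy bipartite phenotype-gene graph (placeholder names, not real data).
edges = [
    ("hpo_1", "gene_a"),
    ("hpo_1", "gene_b"),
    ("hpo_2", "gene_b"),
    ("hpo_2", "gene_c"),
    ("hpo_3", "gene_c"),
]
adj = build_adjacency(edges)
rng = random.Random(0)
# Two walks of five nodes from every node in the graph.
walks = [
    node2vec_walk(adj, node, length=5, p=1.0, q=0.5, rng=rng)
    for node in adj
    for _ in range(2)
]
```

Each walk plays the role of a "sentence" of node identifiers. With q below 1 the walk is biased toward exploring outward, while a small p makes it more likely to revisit the node it just left.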


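Related content

Link prediction

Link prediction in networks is the problem of using known information, such as the network's nodes and structure, to estimate the likelihood of a link between two nodes that are not yet connected. It covers both the prediction of links that exist but are still unknown (existent yet unknown links) and the prediction of links that will form in the future (future links). Research on this problem is of significant theoretical and practical value.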
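As a small illustration of this definition, the sketch below scores every unconnected node pair by the Jaccard coefficient of their neighbor sets. This is a simple classical heuristic chosen here for brevity, not the embedding-based method of the paper above. The toy graph and the helper name jaccard_scores are assumptions made for the example.

```python
from itertools import combinations


def jaccard_scores(edges):
    """Score each unconnected node pair by the overlap of their neighbors."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    scores = {}
    for u, v in combinations(sorted(adj), 2):
        if v in adj[u]:
            continue  # already linked, nothing to predict
        union = adj[u] | adj[v]
        # Every node here has at least one neighbor, so union is non-empty.
        scores[(u, v)] = len(adj[u] & adj[v]) / len(union)
    return scores


# Toy undirected network (hypothetical node names).
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("b", "d"), ("c", "d"), ("d", "e")]
scores = jaccard_scores(edges)
# Rank candidate links from most to least likely under this heuristic.
ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

Pairs that share many neighbors relative to their combined neighborhoods rank highest. In this toy graph the pair ("a", "d") comes first because both nodes are connected to "b" and "c".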
