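Project title: Research on Methods for Predicting Disease Genes and Their Complexes Based on Heterogeneous Networks
Project number: No.61502071
Project type: Young Scientists Fund
Year initiated/approved: 2016
Discipline: Computer Science
Project author: Xu Bo (徐博)
Affiliation: Dalian University of Technology
Funding: 210,000 CNY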
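Abstract (translated from Chinese): Predicting disease genes is one of the major challenges in human health. Identifying disease genes and their related complexes is an important basis for understanding disease mechanisms and for developing genetic diagnosis, prevention, and treatment of disease. However, the performance of existing methods is constrained by limited protein complex information, an insufficient basis for computing phenotype similarity, and noisy data in protein relations. This project therefore addresses these three problems. It designs a protein relation network reconstruction algorithm that is based on a multi-layer network schema conversion strategy and integrates features from rich resources, to address noise in protein relations; proposes a complex identification algorithm based on attributed networks that combines biological attributes with topological structure to improve complex identification performance; and studies the use of the Skip-gram model to learn word vectors for disease phenotypes from large volumes of unlabeled biomedical literature, from which phenotype similarity is computed. Finally, it studies how to combine these three results into a heterogeneous network for predicting disease genes and related complexes, with the aim of fundamentally improving prediction performance. The identified disease genes and complexes will provide a useful reference for biologists, and the network construction algorithm may also inform related network-based research such as drug target discovery.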
Keywords (translated from Chinese): disease gene; protein complex; protein relation network
English abstract: Disease gene prediction is a major challenge in the field of human health. Finding disease genes and related protein complexes is an important foundation for understanding disease mechanisms and for disease prevention and genetic diagnosis. However, the performance of existing methods is limited by incomplete protein complex information, weakly grounded phenotype similarity measures, and noise in protein-protein interaction (PPI) databases. This study therefore addresses these three problems. First, we reconstruct the PPI network by designing a multi-layer network conversion algorithm that incorporates rich biological resources. Second, we propose a protein complex identification algorithm based on an attributed PPI network, which combines biological properties with topological structure to improve identification performance. Third, we learn phenotype vectors with the Skip-gram model from a large body of unlabeled biomedical literature and compute phenotype similarities from them. Finally, we combine these three results into a heterogeneous network for predicting disease genes and related complexes. The results of this project will provide a strong reference for biologists. Furthermore, the network reconstruction method is also of significance for drug target discovery and related research.
English keywords: disease gene; protein complex; protein-protein interaction network