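Project title: Research on Protein Function Prediction Algorithms Based on Multiple Biological Networks
Project number: No.61502214
Project type: Young Scientists Fund
Year of approval: 2016
Discipline: Computer Science
Project author: Peng Wei (彭玮)
Author affiliation: Kunming University of Science and Technology (昆明理工大学)
Funding: 190,000 yuan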
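Chinese abstract (translated): As the gap between genome sequence data and their functional annotations keeps widening, efficient protein function prediction methods have become a research focus of the post-genomic era. Because cellular functions are systematic, complex, and diverse, designing suitable models and methods to fuse heterogeneous biological information is a major challenge for protein function prediction. This project constructs multiple biological networks to describe the associations among different kinds of biological information and between that information and protein function. Based on these networks, it studies new methods for protein function prediction. First, the features of each network, the relationships between networks, and the relationships between networks and functions are analyzed. To account for the hierarchy and heterogeneity of biological networks, multi-network alignment and multiple diffusion methods are used to predict protein function. To account for the dynamics of biological networks, protein complexes assembled by a just-in-time mechanism are identified, and dynamic protein subnetworks are mined, on the constructed dynamic networks in order to predict protein function. Finally, a multi-level machine learning framework integrates the multiple biological networks to predict protein function. The data, research results, and algorithm implementations involved will be integrated into a unified information platform that supports multi-network construction, analysis, and protein function prediction, for convenient use by biological and medical researchers.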
Chinese keywords (translated): protein interaction network; biological function; gene function
English abstract: Accurate annotation of protein functions plays a significant role in understanding life at the molecular level. Next-generation high-throughput DNA sequencing techniques generate a large amount of genome data, and the gap between available sequence data and their functional annotations keeps widening. Protein function prediction therefore remains an active area of research in the post-genomic era. However, designing effective methods to combine multiple kinds of biological information is still a major challenge for protein function prediction, owing to the systematic nature, complexity, and diversity of cell functions. In this project, we look to go beyond traditional machine learning-based methods and employ multiple-network techniques to combine various biological data sources. We first analyze in depth the features of multiple biological data sources and the relationships among these data. Novel algorithms will be proposed to construct multiple biological networks. The features of these biological networks and their relationships with protein functions will be studied. With respect to the hierarchy and heterogeneity of those biological networks, we will adapt multi-network alignment and multi-diffusion algorithms to predict protein functions. With respect to the dynamics of those biological networks, we will identify dynamic protein modules from dynamic protein networks to infer protein functions. Finally, a multi-layer machine learning framework will be employed to integrate multiple biological networks so as to improve the accuracy of protein function prediction. We hope to address a series of hard problems in multiple-network-based protein function prediction in the context of complicated cellular mechanisms, which will help us better understand the mechanisms of cell life at the system level. A unified information access platform will be built on the biological data, research results, and executable programs of this project. It will provide valuable information for biological and medical researchers.
English keywords: protein-protein interaction network; biological function; gene function
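
Illustrative note: the abstract names diffusion over multiple biological networks as one route to function prediction but gives no algorithmic detail. The sketch below is not the project's method. It assumes a simple weighted sum to merge networks and uses a standard random walk with restart as a stand-in for "multi-diffusion". The function names (`combine_networks`, `diffuse`), the toy proteins A to D, and the weights are all hypothetical.

```python
from collections import defaultdict


def combine_networks(networks, weights):
    """Merge several undirected weighted edge lists into one adjacency map.

    Assumption: networks are merged by a weighted sum of edge scores.
    """
    adj = defaultdict(lambda: defaultdict(float))
    for edges, w in zip(networks, weights):
        for u, v, score in edges:
            adj[u][v] += w * score
            adj[v][u] += w * score
    # Return plain dicts so later lookups cannot create new keys.
    return {u: dict(nbrs) for u, nbrs in adj.items()}


def diffuse(adj, seeds, restart=0.5, iters=100):
    """Random walk with restart, started from proteins known to have a function.

    Returns a score per protein; higher means closer to the seed set.
    """
    nodes = sorted(adj)
    start = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(start)
    for _ in range(iters):
        nxt = {n: restart * start[n] for n in nodes}
        for u in nodes:
            total = sum(adj[u].values())
            for v, score in adj[u].items():
                # Spread u's current mass to neighbours in proportion to edge weight.
                nxt[v] += (1.0 - restart) * p[u] * score / total
        p = nxt
    return p


# Toy data (hypothetical): a PPI network and a co-expression network.
ppi = [("A", "B", 1.0), ("B", "C", 1.0)]
coexpression = [("C", "D", 1.0)]
combined = combine_networks([ppi, coexpression], weights=[0.7, 0.3])

# Protein A is annotated with the function of interest; rank the others.
scores = diffuse(combined, seeds={"A"})
ranking = sorted(scores, key=scores.get, reverse=True)
```

Unannotated proteins that score highest are the candidates for the seed function. In practice the per-network weights would need to be tuned or learned, which this sketch does not attempt.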