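Project title: Predicting and Analyzing the Structural Flexibility of Proteins Using Machine Learning Methods
Project number: No. 61003187
Project type: Young Scientists Fund
Approval year: 2011
Discipline: Metallurgy and metal technology
Principal investigator: Zhang Hua (张华)
Affiliation: Zhejiang Gongshang University (浙江工商大学)
Funding: 70,000 yuan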
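Abstract (translated from Chinese): Structural flexibility and dynamics underlie the ability of proteins to carry out their biological functions, and their study is an important part of proteome projects. As techniques for determining protein structures have matured, large amounts of experimentally derived flexibility data have accumulated, including B-factors, conformational changes, disordered regions, order parameters, chemical shifts, and protection factors. Clearly explaining the biological mechanisms behind these data remains a distant goal. One low-cost route that requires no experiments is to predict protein flexibility from sequence or structure using machine learning. Although many researchers have worked on flexibility prediction, prediction accuracy is still generally low, and no one has yet predicted and analyzed these flexibility measures systematically. This project uses machine learning to predict flexibility measures of proteins, improving accuracy through new feature design and feature selection. It also uses machine learning to develop a sequence-based Gaussian network model. Accurate prediction of protein flexibility can provide a reliable theoretical basis for further structure prediction and for function and drug design.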
Keywords (translated from Chinese): protein flexibility; B-factor; disordered regions; machine learning; Gaussian network model
English abstract: The flexibility and dynamics of proteins are essential to their various functions, and research on protein dynamics is now an important component of proteome projects. With advances in techniques for determining protein structures, many types of data reflecting protein flexibility have become available, including B-factors, conformational changes, disordered regions, order parameters, chemical shifts, and protection factors. Explaining the biological mechanisms behind these flexibility data remains a challenge. Although flexibility prediction has received extensive attention, current prediction accuracy is low, and no one has provided a systematic prediction and analysis of these flexibility measures. This project predicted protein flexibility measures using machine learning methods, including new feature design and feature selection. In addition, we developed a sequence-based Gaussian network model for protein dynamics, built on contact maps predicted by machine learning methods. Accurate predictions of protein flexibility could inform protein structure prediction and rational drug design.
English keywords: protein flexibility; B-factor; disordered regions; machine learning; Gaussian network model
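
Illustrative sketch (not part of the original record): the abstract mentions a Gaussian network model built on a contact map. The Python snippet below shows, under simplifying assumptions, how a standard Gaussian network model could turn a binary residue contact map into relative per-residue fluctuations comparable to B-factors. The function name `gnm_fluctuations`, the uniform spring constant, the tolerance `tol`, and the toy five-residue chain are all hypothetical choices made for this example; it is not the project's implementation, and in the project the contact map would come from a machine learning predictor rather than being written by hand.

```python
import numpy as np


def gnm_fluctuations(contact_map, tol=1e-8):
    """Relative mean-square fluctuation per residue from a contact map.

    Assumes a standard Gaussian network model with one uniform spring
    constant, so the values are meaningful only up to a scale factor.
    """
    c = np.asarray(contact_map, dtype=float)
    c = np.maximum(c, c.T)           # force the map to be symmetric
    np.fill_diagonal(c, 0.0)         # a residue is not in contact with itself

    # Kirchhoff (graph Laplacian) matrix: degree on the diagonal,
    # -1 for every pair of residues in contact.
    kirchhoff = np.diag(c.sum(axis=1)) - c

    # Diagonal of the pseudo-inverse, computed from the eigen-decomposition
    # while skipping the (near-)zero modes of rigid-body motion.
    eigvals, eigvecs = np.linalg.eigh(kirchhoff)
    keep = eigvals > tol
    return (eigvecs[:, keep] ** 2 / eigvals[keep]).sum(axis=1)


# Toy input: a linear chain of five residues with contacts i <-> i+1 only.
chain = np.zeros((5, 5))
for i in range(4):
    chain[i, i + 1] = 1.0

flex = gnm_fluctuations(chain)
print(np.round(flex / flex.max(), 3))  # chain ends are the most flexible
```

In this toy chain the terminal residues have the fewest contacts and therefore the largest predicted fluctuations, which is the qualitative behaviour such a model is expected to show.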