Research on Protein Tertiary Structure Prediction Algorithms Based on Stochastic Graphical Models

Project name: Research on Protein Tertiary Structure Prediction Algorithms Based on Stochastic Graphical Models

Project number: No.30800168

Project type: Young Scientists Fund

Approval year: 2009

Project discipline: Metallurgy and Metalworking Technology

Project author: Bu Dongbo (卜东波)

Author affiliation: Institute of Computing Technology, Chinese Academy of Sciences

Project funding: 200,000 yuan

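Chinese abstract: Protein tertiary structure prediction is an important problem in computational biology. Partly because interactions are handled inadequately, existing algorithms achieve low prediction accuracy on long proteins and on beta-class proteins. To address these shortcomings, this project follows the route of "estimating residue dihedral-angle preferences, then predicting dihedral angles according to interactions." First, linear programming is used to predict the possible structures of all short fragments of length 9, taking into account not only sequence features such as the profile but also structural features such as solvent accessibility, which improves prediction accuracy. Next, based on the predicted fragment structures, a von Mises distribution is used for each residue to characterize its preference for the dihedral angles phi/psi, and an iterative technique progressively improves the accuracy of the dihedral-angle predictions. Finally, a Markov random field model is built from the predicted short-range and long-range interactions; it can predict not only dihedral-angle information such as phi/psi but also the alignment between the target protein and a template. This approach makes effective use of short-range and long-range interaction information while overcoming the drawback of Monte Carlo methods, whose search space is excessively large. We tested the algorithms and software package in CASP, the international protein structure prediction competition, and the experimental results demonstrate the novelty and effectiveness of this work.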

Chinese keywords: protein tertiary structure; stochastic graphical model; long-range interaction; integer linear programming

English abstract: Protein structure prediction is an important problem in computational biology. Because distant interactions are described ineffectively, existing methods show low prediction accuracy for long proteins and beta proteins. We follow the strategy of "estimating the angle distribution, then predicting phi/psi according to interactions." Briefly, we first predict possible local structures for all 9-mer protein sequence fragments using linear programming, encoding the sequence profile, solvent accessibility, and other structural information to improve prediction accuracy. Second, based on the predicted local structures, we use von Mises distributions to describe the distribution of the phi/psi angles, and then apply importance sampling to improve the prediction gradually. Finally, we build conditional random fields (CRFs), including profile CRFs, chain CRFs, and TreeCRFs, to predict either the protein-protein alignment or the phi/psi angles of all amino acids directly. Experimental results suggest that the CRF model can effectively improve protein structure prediction.

English keywords: protein structure prediction; linear programming; graphical model
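
The abstracts describe modeling each residue's phi/psi preference with a von Mises distribution. As a rough illustration of that one step only, the sketch below evaluates the von Mises density and fits its two parameters to a handful of angles. It is not the project's implementation: the function names (`bessel_i0`, `von_mises_pdf`, `fit_von_mises`), the sample angles, and the cap on the concentration are all assumptions made here, and the concentration estimate uses a common closed-form approximation based on the mean resultant length rather than the iterative or importance-sampling refinement the abstracts mention.

```python
import math

def bessel_i0(x, tol=1e-16):
    """Modified Bessel function I0(x), summed from its power series."""
    total, term, k = 1.0, 1.0, 0
    while term > tol * total:
        k += 1
        # Ratio of consecutive series terms is (x/2)^2 / k^2.
        term *= (x / 2.0) ** 2 / (k * k)
        total += term
    return total

def von_mises_pdf(theta, mu, kappa):
    """Density of a von Mises distribution at angle theta (radians)."""
    return math.exp(kappa * math.cos(theta - mu)) / (2.0 * math.pi * bessel_i0(kappa))

def fit_von_mises(angles, max_kappa=500.0):
    """Estimate (mu, kappa) from angles in radians.

    mu is the circular mean. kappa uses the approximation
    r * (2 - r^2) / (1 - r^2), where r is the mean resultant length.
    max_kappa is an arbitrary cap (an assumption) that keeps the
    density finite when the angles are nearly identical.
    """
    n = len(angles)
    s = sum(math.sin(a) for a in angles) / n
    c = sum(math.cos(a) for a in angles) / n
    mu = math.atan2(s, c)
    r = min(math.hypot(s, c), 1.0 - 1e-12)
    kappa = min(r * (2.0 - r * r) / (1.0 - r * r), max_kappa)
    return mu, kappa

# Hypothetical phi angles (degrees) for one residue, e.g. taken from
# candidate fragment structures; the values are illustrative only.
phi_samples = [math.radians(d) for d in (-57, -63, -60, -65, -55)]
mu, kappa = fit_von_mises(phi_samples)
print("mean phi (deg):", math.degrees(mu), "kappa:", kappa)
print("density at the mean:", von_mises_pdf(mu, mu, kappa))
```

A larger kappa corresponds to a sharper preference around the mean angle, so a residue whose candidate fragments agree closely would receive a tightly peaked density under this sketch.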
