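Project title: Research on computational models for protein-protein docking
Project number: No.30870474
Project type: General Program
Year of approval: 2009
Discipline: Biological sciences
Project author: Lei Hongxing (雷红星)
Author affiliation: Beijing Institute of Genomics, Chinese Academy of Sciences
Funding: 300,000 yuan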
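Chinese abstract: Protein-protein docking is an important research direction in computational structural biology. After many years of research, two major difficulties remain: how to effectively evaluate the complexes produced by docking software, and how to handle protein flexibility during docking. This project explored both problems extensively, with an emphasis on the first. Using the latest Benchmark 4, we comprehensively evaluated scoring functions that do not rely on experimental information. We found that DDFIRE total energy, DDFIRE interface energy, interface size, and ZDOCK score perform comparably overall, and that a simple combination of these four functions performs much better than any single one. We assessed whether a "good" complex was present among the 2000 top-scored complexes. Across the 147 targets tested, the composite scoring function failed on only 13 targets, whereas the ZDOCK score, one of the better-regarded scores internationally, failed on 22. Further analysis of these 13 targets showed that most of them had very low-quality candidate pools; for 8 of them the pool contained only 2-34 "good" complexes. The composite scoring function therefore handles the scoring task well for docking results whose candidate pools are not too poor. In addition, the sampling frequency of the interface and the overall size of the complex performed somewhat better than the four scoring functions above at filtering out "bad" complexes.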
Chinese keywords: docking; scoring function; ranking; filtering; complex
English abstract: After many years of development, the greatest challenges in the field of protein-protein docking remain twofold: ranking the candidates effectively and dealing with conformational changes during docking. For the first challenge, increasing effort has been devoted to incorporating more experimental information about the binding interface. However, this does not help us understand the mechanism by which proteins interact with each other, nor does it guide us towards the prediction of novel protein interactions. In this work, we have attempted to explore the limits of various scoring functions in the absence of any information about the binding interface. These scoring functions cover energetic and geometric aspects, including DDFIRE total energy, DDFIRE interface energy, interface size, interface frequency, overall packing and ZDOCK score. We used the recently available ZDOCK decoy sets for Benchmark 4 (54000 candidates for each target) to evaluate the ranking and filtering capability of each scoring function. We have demonstrated that the first three scoring functions have performance comparable and complementary to that of the ZDOCK score. A simple combination of these four scores can lead to significantly enhanced performance. For the 147 targets evaluated, the composite scoring function failed on only 13 targets to retain at least one acceptable candidate among the top 2000 selections, compared to 22 failed targets for the ZDOCK score alone. Furthermore, there were only 2-35 acceptable candidates in the original candidate pool for 8 of these 13 failed targets, suggesting that the composite scoring function works well when a reasonable number of good candidates are available. In addition, the overall packing performed exceptionally well in filtering out the bottom 5000 candidates for 110 of the 143 targets without losing a single good candidate, suggesting its important role in protein-protein interaction. This comprehensive evaluation will better clarify the strengths and weaknesses of the individual scoring functions.
English keywords: docking; scoring function; ranking; filtering; complex
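
Illustrative sketch: the abstracts describe combining four scores into a composite scoring function and counting a target as a success when at least one acceptable candidate is among the top 2000 selections. The record does not say how the scores were combined, so the rank-sum rule below is only an assumption, as are the function names, the toy values, and the direction of each score (lower energy treated as better, larger interface size and ZDOCK score treated as better).

```python
from typing import Dict, List, Sequence


def ranks(values: Sequence[float], lower_is_better: bool) -> List[int]:
    """Return the 0-based rank of each candidate for one score (0 = best)."""
    order = sorted(range(len(values)), key=lambda i: values[i],
                   reverse=not lower_is_better)
    rank = [0] * len(values)
    for position, idx in enumerate(order):
        rank[idx] = position
    return rank


def composite_rank_sum(scores: Dict[str, Sequence[float]],
                       lower_is_better: Dict[str, bool]) -> List[int]:
    """Sum the per-score ranks; a smaller sum means a better candidate.

    Assumption: rank summation stands in for the unspecified
    "simple combination" mentioned in the abstract.
    """
    n = len(next(iter(scores.values())))
    total = [0] * n
    for name, values in scores.items():
        for i, r in enumerate(ranks(values, lower_is_better[name])):
            total[i] += r
    return total


def top_n_success(total: Sequence[int], is_acceptable: Sequence[bool],
                  n: int) -> bool:
    """True if at least one acceptable candidate is among the n best."""
    best = sorted(range(len(total)), key=lambda i: total[i])[:n]
    return any(is_acceptable[i] for i in best)


# Toy data for five hypothetical candidates of one target (not real results).
scores = {
    "ddfire_total": [-10, -30, -20, -5, -25],
    "ddfire_interface": [-2, -8, -6, -1, -7],
    "interface_size": [400, 900, 700, 300, 1000],
    "zdock_score": [12, 18, 20, 9, 15],
}
lower_is_better = {
    "ddfire_total": True,       # assumed: lower energy is better
    "ddfire_interface": True,   # assumed: lower energy is better
    "interface_size": False,    # assumed: larger interface is better
    "zdock_score": False,       # higher ZDOCK score is better
}
is_acceptable = [False, False, True, False, False]

total = composite_rank_sum(scores, lower_is_better)
success_top2 = top_n_success(total, is_acceptable, 2)
success_top3 = top_n_success(total, is_acceptable, 3)
```

With the real decoy sets, n would be 2000 and each score list would hold the 54000 candidates of one target; the number of failed targets is then the count of targets for which top_n_success returns False.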