Protein structure prediction has been a grand challenge for the past 50 years, owing to its broad scientific and practical interest. There are two major types of modelling algorithm: template-free modelling and template-based modelling. The latter is suitable for easy prediction tasks and is widely adopted in computer-aided drug discovery for drug design and screening. Although it was introduced several decades ago, the current template-based modelling approach suffers from two important problems: 1) the template-query sequence alignment contains many missing regions, and 2) the accuracy of the distance pairs varies across different regions of the template, and this information is not well incorporated into the modelling. To solve these two problems, we propose a structural optimization process based on template modelling, introducing two neural network models that predict the distance information of the missing regions and the accuracy of the distance pairs in different regions of the template-modelled structure. The predicted distances and residue-pair-specific accuracy information are incorporated into the potential energy function for structural optimization, which significantly improves the quality of the original template modelling decoys.
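To illustrate how predicted distances and per-pair accuracy estimates might enter a potential energy function, the sketch below uses an accuracy-weighted harmonic restraint, E = sum over i<j of w_ij (d_ij - t_ij)^2, minimized by gradient descent with step halving. The abstract does not specify the functional form, so the harmonic term, the optimizer, and all names (`restraint_energy`, `restraint_gradient`, `optimize`, `target`, `weight`) are assumptions made for illustration, not the authors' implementation.

```python
import math

def distance(a, b):
    """Euclidean distance between two 3D points."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

def restraint_energy(coords, target, weight):
    """Hypothetical energy: sum over pairs i<j of w_ij * (d_ij - t_ij)^2."""
    n = len(coords)
    return sum(
        weight[i][j] * (distance(coords[i], coords[j]) - target[i][j]) ** 2
        for i in range(n)
        for j in range(i + 1, n)
    )

def restraint_gradient(coords, target, weight):
    """Analytic gradient of restraint_energy with respect to each coordinate."""
    n = len(coords)
    grad = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = distance(coords[i], coords[j])
            if d < 1e-9:
                continue  # coincident points: direction undefined, skip the pair
            coef = 2.0 * weight[i][j] * (d - target[i][j]) / d
            for k in range(3):
                g = coef * (coords[i][k] - coords[j][k])
                grad[i][k] += g
                grad[j][k] -= g
    return grad

def optimize(coords, target, weight, steps=200, lr=0.01):
    """Gradient descent that accepts a step only if the energy decreases."""
    x = [list(p) for p in coords]
    e = restraint_energy(x, target, weight)
    for _ in range(steps):
        g = restraint_gradient(x, target, weight)
        trial = [[p[k] - lr * gi[k] for k in range(3)] for p, gi in zip(x, g)]
        e_trial = restraint_energy(trial, target, weight)
        if e_trial < e:
            x, e = trial, e_trial
        else:
            lr *= 0.5  # step was too large: halve it and retry
    return x, e

# Toy usage with made-up numbers: four pseudo-residues, target distances taken
# from a reference geometry, and a lower weight on one pair to mimic a
# restraint that is judged less accurate.
reference = [[0.0, 0.0, 0.0], [3.8, 0.0, 0.0], [3.8, 3.8, 0.0], [0.0, 3.8, 1.0]]
target = [[distance(a, b) for b in reference] for a in reference]
weight = [[1.0] * 4 for _ in range(4)]
weight[0][3] = weight[3][0] = 0.2
decoy = [[0.5, 0.0, 0.0], [3.8, 0.4, 0.0], [3.2, 3.8, 0.3], [0.0, 3.0, 1.0]]
refined, final_energy = optimize(decoy, target, weight)
```

Under these assumptions, a pair with a low weight contributes little to the energy, so an unreliable restraint pulls the structure less strongly than a reliable one.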