Protein Models Comparator: Scalable Bioinformatics Computing on the Google App Engine Platform

The comparison of computer-generated protein structural models is an important element of protein structure prediction. It has many uses, including model quality evaluation, selection of the final models from a large set of candidates, and optimisation of the parameters of energy functions used in template-free modelling and refinement. Although many protein comparison methods are available online on numerous web servers, they are not well suited for large-scale model comparison: (1) they operate with methods designed to compare actual proteins, not models of the same protein, and (2) the majority of them offer only a single pairwise structural comparison and cannot scale to the thousands of comparisons required. To bridge the gap between protein and model structure comparison we have developed the Protein Models Comparator (pm-cmp). To deliver scalability on demand and handle large comparison experiments, pm-cmp was implemented "in the cloud". Protein Models Comparator is a scalable web application for fast distributed comparison of protein models with the RMSD, GDT_TS, TM-score and Q-score measures. It runs on the Google App Engine (GAE) cloud platform and is a showcase of how the emerging PaaS (Platform as a Service) technology could be used to simplify the development of scalable bioinformatics services. The functionality of pm-cmp is accessible through an API, which allows full automation of experiment submission and results retrieval. Protein Models Comparator is free software released under the GNU Affero General Public License and is available with its source code at: http://www.infobiotics.org/pm-cmp. This article presents a new web application addressing the need for large-scale, model-specific protein structure comparison and provides an insight into the GAE platform and its usefulness in scientific computing.
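To make the model-specific setting concrete, the following is a minimal Python sketch of three of the measures named above (RMSD, GDT_TS and TM-score). It is not the pm-cmp implementation. It assumes that the model and the reference contain the same residues in the same order, which is what distinguishes model comparison from general protein comparison, and that the Cα coordinates have already been superposed. Real tools search for the superposition that optimises each score, so the values here are only a simplified approximation. The function names are hypothetical, the lower bound of 0.5 Å on d0 is an assumption borrowed from common practice, and Q-score is omitted.

```python
import math

def ca_distances(model, reference):
    """Per-residue distances between two pre-superposed C-alpha traces.

    Both arguments are sequences of (x, y, z) tuples in Angstroms, with
    residue i of the model corresponding to residue i of the reference.
    """
    if len(model) != len(reference) or not model:
        raise ValueError("structures must be non-empty and of equal length")
    return [math.dist(a, b) for a, b in zip(model, reference)]

def rmsd(distances):
    """Root-mean-square deviation over the given residue distances."""
    return math.sqrt(sum(d * d for d in distances) / len(distances))

def gdt_ts(distances):
    """Simplified GDT_TS on a 0-100 scale: the mean fraction of residues
    within 1, 2, 4 and 8 Angstroms, using one fixed superposition."""
    n = len(distances)
    fractions = [sum(d <= cutoff for d in distances) / n
                 for cutoff in (1.0, 2.0, 4.0, 8.0)]
    return 100.0 * sum(fractions) / 4

def tm_score(distances):
    """Simplified TM-score using one fixed superposition.

    d0 follows the usual length-dependent formula; the 0.5 Angstrom
    floor for short chains is an assumption of this sketch.
    """
    n = len(distances)
    d0 = 1.24 * (n - 15) ** (1.0 / 3.0) - 1.8 if n > 15 else 0.5
    d0 = max(d0, 0.5)
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / n

def compare(model, reference):
    """Return the three scores for one model/reference pair."""
    d = ca_distances(model, reference)
    return {"rmsd": rmsd(d), "gdt_ts": gdt_ts(d), "tm_score": tm_score(d)}
```

Because each pairwise comparison is independent of the others, a large experiment reduces to many calls of a function like `compare`, which is the kind of workload that can be distributed across workers on a platform such as GAE.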
