Proteins constitute a large group of macromolecules that perform a multitude of functions in all living organisms. They achieve this by adopting distinct three-dimensional structures encoded by the sequence of their constituent amino acids in one or more polypeptides. This paper considers the statistical modelling of protein backbone torsion angles. Two new distributions for toroidal data are proposed by applying the Möbius transformation to the bivariate von Mises distribution. Marginal and conditional distributions, as well as sine-skewed versions of the proposed models, are also developed. Three large data sets consisting of bivariate information about protein domains are analysed to illustrate the strength and flexibility of the proposed models. Finally, a simulation study is conducted to evaluate the maximum likelihood estimates and to identify the best method of generating samples from the proposed models, for use as proposal distributions in Markov chain Monte Carlo sampling when predicting the 3D structure of proteins.
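As a rough illustration of the construction described above, the following Python sketch applies a circle-preserving Möbius transformation to pairs of von Mises angles. It is only a sketch under stated assumptions: the function name `mobius_circle`, the parameter `a`, and the particular form z -> (z + a) / (1 + conj(a) z) are illustrative choices, and independent von Mises margins are used as a simple stand-in for a bivariate von Mises distribution. The parameterization actually used in the paper is not given here and may differ.

```python
import numpy as np


def mobius_circle(theta, a):
    """Map angles through z -> (z + a) / (1 + conj(a) * z) with |a| < 1.

    This form sends the unit circle to itself, so the result is again an
    angle, returned in [-pi, pi]. Both the name and this particular
    parameterization are illustrative assumptions.
    """
    a = complex(a)
    if abs(a) >= 1:
        raise ValueError("a must lie strictly inside the unit disc")
    z = np.exp(1j * np.asarray(theta, dtype=float))  # angle -> point on circle
    w = (z + a) / (1 + np.conj(a) * z)               # Mobius map of the circle
    return np.angle(w)                               # point on circle -> angle


rng = np.random.default_rng(0)

# Assumption: independent von Mises margins stand in for a bivariate
# von Mises sample of torsion-angle pairs (phi, psi).
phi = rng.vonmises(mu=0.0, kappa=2.0, size=1000)
psi = rng.vonmises(mu=1.0, kappa=1.0, size=1000)

# Transform each coordinate; the pairs (phi_t, psi_t) still lie on the torus.
phi_t = mobius_circle(phi, 0.4)
psi_t = mobius_circle(psi, 0.3j)
```

With `a = 0` the map is the identity, and moving `a` away from zero pulls mass towards one side of the circle, which is one way such a transformation can add asymmetry and peakedness to a symmetric base distribution.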