Phylogenetic trait evolution models allow for the estimation of evolutionary correlations between a set of traits observed in a sample of related organisms. By directly modeling the evolution of the traits along an estimable phylogenetic tree, the model's structure effectively controls for shared evolutionary history. In these models, relevant correlations are usually assessed through the highest posterior density intervals of their marginal distributions. However, the selected correlations alone may not provide the full picture of trait relationships. Their association structure, expressed through a graph that encodes partial correlations, can in contrast reveal sparsity patterns that highlight direct associations between traits. To develop a model-based method for identifying this association structure, we explore the use of Gaussian graphical models (GGMs) for covariance selection. We model the precision matrix with a G-Wishart conjugate prior, which results in sparse precision estimates. Furthermore, the model naturally allows for Bayes factor tests of association between the traits, with no additional computation required. We evaluate our approach through Monte Carlo simulations and through applications that examine the association structure and evolutionary correlations of phenotypic traits in Darwin's finches and of genomic and phenotypic traits in prokaryotes. Our approach provides accurate graph estimates and lower errors for the precision and correlation parameter estimates, particularly for conditionally independent traits, which are the target for sparsity in GGMs.
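To make the distinction between marginal and partial correlations concrete, the following is a minimal sketch, not the paper's implementation. It uses the standard identity that the partial correlation between traits i and j given all others is -k_ij / sqrt(k_ii k_jj), where k_ij are entries of the precision matrix. The function name `partial_correlations` and the 3-trait matrix `K` are hypothetical and chosen only for illustration.

```python
import numpy as np

def partial_correlations(precision):
    """Convert a precision matrix into a matrix of partial correlations."""
    d = np.sqrt(np.diag(precision))
    pcor = -precision / np.outer(d, d)  # -k_ij / sqrt(k_ii * k_jj)
    np.fill_diagonal(pcor, 1.0)         # diagonal set to 1 by convention
    return pcor

# Hypothetical precision matrix for three traits (illustrative values only).
# The zero in position (0, 2) means traits 0 and 2 are conditionally
# independent given trait 1, i.e. no edge between them in the graph.
K = np.array([[ 2.0, -1.0,  0.0],
              [-1.0,  2.0, -1.0],
              [ 0.0, -1.0,  2.0]])

pcor = partial_correlations(K)

# Marginal correlations come from the covariance matrix (inverse precision).
cov = np.linalg.inv(K)
sd = np.sqrt(np.diag(cov))
corr = cov / np.outer(sd, sd)

print("partial correlations:\n", pcor)
print("marginal correlations:\n", corr)
```

In this toy example, traits 0 and 2 have a nonzero marginal correlation even though their partial correlation is zero, which is the kind of indirect association a sparse precision matrix is meant to separate from direct ones.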