Dynamic programming approaches have long been applied to fit models of univariate and multivariate trait evolution on phylogenetic trees for discrete and continuous traits, and more recently adapted to phylogenetic networks with reticulation. We previously showed that various trait evolution models on a network can be readily cast as probabilistic graphical models, so that likelihood-based estimation can proceed efficiently via belief propagation on an associated clique tree. Even so, exact likelihood inference can grow computationally prohibitive for large complex networks. Loopy belief propagation can similarly be applied to these settings, using non-tree cluster graphs to optimize a factored energy approximation to the log-likelihood, and may provide a more practical trade-off between estimation accuracy and runtime. However, the influence of cluster graph structure on this trade-off is not precisely understood. We conduct a simulation study using the Julia package PhyloGaussianBeliefProp to investigate how varying maximum cluster size affects this trade-off for Gaussian trait evolution models on networks. We discuss recommended choices for maximum cluster size, and prove the equivalence of likelihood-based and factored-energy-based parameter estimates for the homogeneous Brownian motion model.
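To make the tree case in the opening sentence concrete, the following minimal sketch computes the log-likelihood of a homogeneous Brownian motion model on a small three-tip tree in two ways: by leaf-to-root Gaussian message passing (the dynamic programming that belief propagation reduces to on a tree) and by direct evaluation of the multivariate normal density. It is an illustration under stated assumptions, not the method or API of PhyloGaussianBeliefProp: the tree, tip values, fixed root state `mu`, rate `sigma2`, and all function names are hypothetical, and it covers neither networks with reticulation nor loopy belief propagation.

```python
import math

def log_normal_pdf(x, mean, var):
    """Log-density of a univariate normal N(mean, var) evaluated at x."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)

def message_to_parent(node, sigma2):
    """Leaf-to-root message for Brownian motion with rate sigma2.

    A node is (branch_length, payload): payload is a float for a tip
    (the observed trait) or a list of child nodes for an internal node.
    Returns (m, v, c), meaning the message, as a function of the parent
    value x, equals exp(c) * N(x; m, v).
    """
    branch_length, payload = node
    if isinstance(payload, float):  # tip: observed value, no scale factor
        return payload, sigma2 * branch_length, 0.0
    m, v, c = None, None, 0.0
    for child in payload:
        cm, cv, cc = message_to_parent(child, sigma2)
        c += cc
        if m is None:
            m, v = cm, cv
        else:
            # Product of two Gaussians in the same variable:
            # N(x; m, v) N(x; cm, cv) = N(m; cm, v + cv) N(x; m_new, v_new)
            c += log_normal_pdf(m, cm, v + cv)
            v_new = 1.0 / (1.0 / v + 1.0 / cv)
            m, v = v_new * (m / v + cm / cv), v_new
    # Integrating out this node along its branch adds the branch variance.
    return m, v + sigma2 * branch_length, c

def bm_loglik_message_passing(root_children, mu, sigma2):
    """Log-likelihood with the root state fixed at mu (an assumption)."""
    total = 0.0
    for child in root_children:
        m, v, c = message_to_parent(child, sigma2)
        total += c + log_normal_pdf(mu, m, v)
    return total

def mvn_logpdf(y, mean, cov):
    """Multivariate normal log-density via a hand-written Cholesky factor."""
    n = len(y)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = cov[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    z = []  # forward substitution: solve L z = y - mean
    for i in range(n):
        r = y[i] - mean - sum(L[i][k] * z[k] for k in range(i))
        z.append(r / L[i][i])
    logdet = 2.0 * sum(math.log(L[i][i]) for i in range(n))
    return -0.5 * (n * math.log(2 * math.pi) + logdet + sum(t * t for t in z))

# Hypothetical tree ((A:1,B:1):1,C:2) with made-up tip values.
root_children = [(1.0, [(1.0, 0.3), (1.0, -0.1)]), (2.0, 1.2)]
tip_values = [0.3, -0.1, 1.2]                  # order: A, B, C
shared_path = [[2.0, 1.0, 0.0],                # shared root-to-tip path lengths
               [1.0, 2.0, 0.0],
               [0.0, 0.0, 2.0]]
mu, sigma2 = 0.5, 0.8
cov = [[sigma2 * t for t in row] for row in shared_path]

loglik_bp = bm_loglik_message_passing(root_children, mu, sigma2)
loglik_direct = mvn_logpdf(tip_values, mu, cov)
print(loglik_bp, loglik_direct)
```

On a tree the two values agree up to floating-point error, because message passing is exact there. The direct evaluation factorizes an n-by-n covariance matrix, while the recursion visits each edge once, which is the efficiency argument behind the dynamic programming approach.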