Bayesian phylogenetic inference is currently performed via Markov chain Monte Carlo (MCMC) with simple proposal mechanisms. This hinders exploration efficiency and often requires long runs to deliver accurate posterior estimates. In this paper, we present an alternative approach: a variational framework for Bayesian phylogenetic analysis. We propose combining subsplit Bayesian networks, an expressive graphical model for tree topology distributions, with a structured amortization of the branch lengths over tree topologies to obtain a suitable variational family of distributions. We train the variational approximation via stochastic gradient ascent and adopt separate gradient estimators for the continuous and discrete variational parameters to deal with the composite latent space of phylogenetic models. We show that our variational approach performs competitively with MCMC while requiring much less computation, owing to the more efficient exploration mechanism enabled by variational inference. Experiments on a benchmark of challenging real-data Bayesian phylogenetic inference problems demonstrate the effectiveness and efficiency of our methods.
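To make the idea of separate gradient estimators on a composite latent space concrete, the following is a minimal toy sketch, not the method of the paper. It assumes a stand-in model with three discrete "topologies" and one log-normal "branch length" per topology, rather than subsplit Bayesian networks and a phylogenetic likelihood. The choice of estimators is also an assumption made for illustration: a score-function estimator with a leave-one-out baseline for the discrete parameters, and a reparameterization (pathwise) estimator for the continuous ones. All names (`PI`, `M`, `S`, `fit`) are hypothetical.

```python
import math
import random

# Toy target (assumed for illustration): 3 "topologies" with weights PI; given
# topology k, the log of the "branch length" is Normal(M[k], S[k]).
PI = [0.6, 0.3, 0.1]
M = [-1.0, 0.0, 1.0]
S = [0.5, 0.3, 0.8]


def softmax(logits):
    top = max(logits)
    exps = [math.exp(v - top) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fit(steps=5000, batch=32, lr=0.02, seed=0):
    """Maximize the ELBO by stochastic gradient ascent on the toy target."""
    rng = random.Random(seed)
    n = len(PI)
    logits = [0.0] * n  # discrete variational parameters
    mu = [0.0] * n      # continuous parameters: mean of the log branch length
    rho = [0.0] * n     # continuous parameters: log std of the log branch length
    for _ in range(steps):
        q = softmax(logits)
        samples = []
        for _ in range(batch):
            z = rng.choices(range(n), weights=q)[0]  # sample a "topology"
            eps = rng.gauss(0.0, 1.0)
            sigma = math.exp(rho[z])
            x = mu[z] + sigma * eps                  # reparameterized log length
            resid = (x - M[z]) / S[z]
            # f = log p(z, x) - log q(z) - log q(x | z); the 2*pi terms cancel
            f = (math.log(PI[z]) - 0.5 * resid ** 2 - math.log(S[z])
                 - math.log(q[z]) + 0.5 * eps ** 2 + rho[z])
            samples.append((z, eps, sigma, resid, f))
        total_f = sum(s[4] for s in samples)
        g_logits, g_mu, g_rho = [0.0] * n, [0.0] * n, [0.0] * n
        for z, eps, sigma, resid, f in samples:
            # Discrete part: score-function estimator, leave-one-out baseline.
            adv = f - (total_f - f) / (batch - 1)
            for k in range(n):
                indicator = 1.0 if k == z else 0.0
                g_logits[k] += adv * (indicator - q[k]) / batch
            # Continuous part: pathwise derivative of f through x.
            dx = -resid / S[z]
            g_mu[z] += dx / batch
            g_rho[z] += (dx * sigma * eps + 1.0) / batch
        for k in range(n):
            logits[k] += lr * g_logits[k]
            mu[k] += lr * g_mu[k]
            rho[k] += lr * g_rho[k]
    return softmax(logits), mu, [math.exp(r) for r in rho]
```

Calling `fit()` returns the fitted topology probabilities, log-length means, and log-length standard deviations, which should approach `PI`, `M`, and `S` because the toy target lies inside the variational family. In a real phylogenetic model the target is an unnormalized posterior, so no such exact match is expected.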