This work focuses on clustering populations with a hierarchical dependency structure that can be described by a tree. Our main example is the phylogenetic tree, whose nodes often represent biological species. Clustering the populations in this setting is equivalent to identifying branches of the tree where the populations at the parent and child nodes have significantly different distributions. We construct a nonparametric Bayesian model based on hierarchical Pitman-Yor and Poisson processes to exploit this hierarchical structure, with a key contribution being the ability to share statistical information between subpopulations. We develop an efficient particle MCMC algorithm to address the computational challenges of posterior inference. We illustrate the efficacy of the proposed approach on both synthetic and real-world problems.
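For readers unfamiliar with the Pitman-Yor process named above, the following is a minimal sketch of its standard Chinese restaurant construction for a single population. It is not the hierarchical model or the particle MCMC sampler of this work; the function name `pitman_yor_crp` and its parameters are hypothetical, and the sketch assumes a strictly positive concentration parameter for simplicity.

```python
import random


def pitman_yor_crp(n, discount, concentration, seed=0):
    """Seat n customers by the Pitman-Yor Chinese restaurant process.

    Illustrative only: a single-population sketch, not the hierarchical
    model described in the abstract. Assumes 0 <= discount < 1 and
    concentration > 0 (a simplifying restriction chosen for this example).
    """
    if not 0 <= discount < 1:
        raise ValueError("discount must lie in [0, 1)")
    if concentration <= 0:
        raise ValueError("this sketch assumes concentration > 0")

    rng = random.Random(seed)
    counts = []       # counts[j] = number of customers at table j
    assignments = []  # assignments[i] = table index of customer i

    for _ in range(n):
        k = len(counts)
        # Existing table j has weight (counts[j] - discount);
        # opening a new table has weight (concentration + discount * k).
        weights = [c - discount for c in counts]
        weights.append(concentration + discount * k)
        table = rng.choices(range(k + 1), weights=weights)[0]
        if table == k:
            counts.append(1)      # open a new table (new cluster)
        else:
            counts[table] += 1    # join an existing table
        assignments.append(table)

    return assignments, counts
```

Calling `pitman_yor_crp(100, 0.5, 1.0)` returns one table index per customer together with the table sizes. A larger discount tends to produce more small clusters, which is the power-law behaviour that distinguishes the Pitman-Yor process from the Dirichlet process (the `discount = 0` case).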