The influence of the Dirichlet process mixture is ubiquitous in the Bayesian nonparametrics literature. However, sampling from its posterior distribution remains a challenge, despite the advent of various Markov chain Monte Carlo methods. The primary challenge is the infinite-dimensional setup; even if the infinite-dimensional random measure is integrated out, high dimensionality and discreteness remain difficult issues. In this article, exploiting the key ideas proposed in Bhattacharya (2021b), we propose a novel methodology for drawing iid realizations from posteriors of Dirichlet process mixtures. We focus in particular on the more general and flexible model of Bhattacharya (2008), so that the methods developed here are readily applicable to the traditional Dirichlet process mixture. We illustrate our ideas on the well-known enzyme, acidity and galaxy datasets, which are usually considered benchmark datasets for mixture applications. Generating 10,000 iid realizations from the Dirichlet process mixture posterior of Bhattacharya (2008) given these datasets took 19 minutes, 8 minutes and 5 minutes, respectively, in our parallel implementation.