Posterior computation in hierarchical Dirichlet process (HDP) mixture models is an active area of research in nonparametric Bayesian inference for grouped data. The existing literature focuses almost exclusively on the Chinese restaurant franchise (CRF) representation of the marginal distribution of the parameters, which can mix poorly and whose computational cost is known to grow linearly with the sample size. A recently developed slice sampler allows efficient blocked updates of the parameters, but we show in this article that it is statistically unstable. We develop a blocked Gibbs sampler for the posterior distribution of the HDP that produces statistically stable results, is highly scalable with respect to the sample size, and exhibits good mixing. The heart of the construction is to endow the shared concentration parameter with an appropriately chosen gamma prior, which allows us to break the dependence among the shared mixing proportions and permits independent blocked updates of certain log-concave random variables. En route, we develop an efficient rejection sampler for these random variables that leverages piecewise tangent-line approximations.
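To illustrate the general idea behind a piecewise tangent-line envelope, the sketch below implements a simple rejection sampler for a generic log-concave density: tangent lines to the log-density at a fixed set of abscissae form a piecewise-linear upper hull, the corresponding piecewise-exponential envelope is sampled by inversion, and draws are accepted with probability exp(log f(x) - hull(x)). The target density (a Gamma), the abscissae, and the fixed (non-adaptive) hull are illustrative assumptions only, not the specific sampler developed in the article.

```python
# Minimal sketch of rejection sampling from a log-concave density via a
# piecewise tangent-line upper hull (illustrative; not the article's sampler).
import math
import random

def tangent_envelope_sampler(log_f, dlog_f, abscissae, lower, upper, rng=random):
    """Draw one sample from the density proportional to exp(log_f) on (lower, upper)."""
    xs = sorted(abscissae)
    hs = [log_f(x) for x in xs]     # log-density at the abscissae
    dhs = [dlog_f(x) for x in xs]   # slopes of the tangent lines

    # Breakpoints where consecutive tangent lines intersect; segment i of the
    # upper hull [z[i], z[i+1]] is the tangent line at xs[i].
    z = [lower]
    for i in range(len(xs) - 1):
        zi = (hs[i + 1] - hs[i] - xs[i + 1] * dhs[i + 1] + xs[i] * dhs[i]) / (dhs[i] - dhs[i + 1])
        z.append(zi)
    z.append(upper)

    def seg_mass(i):
        # Integral of the piecewise-exponential envelope over segment i.
        a, b, s = z[i], z[i + 1], dhs[i]
        if abs(s) < 1e-12:
            return math.exp(hs[i]) * (b - a)
        return (math.exp(hs[i] + s * (b - xs[i])) - math.exp(hs[i] + s * (a - xs[i]))) / s

    masses = [seg_mass(i) for i in range(len(xs))]
    total = sum(masses)

    while True:
        # Choose a segment proportional to its envelope mass, then invert its CDF.
        u = rng.random() * total
        i = 0
        while i < len(masses) - 1 and u > masses[i]:
            u -= masses[i]
            i += 1
        a, s = z[i], dhs[i]
        if abs(s) < 1e-12:
            x = a + u / math.exp(hs[i])
        else:
            x = xs[i] + math.log(s * u / math.exp(hs[i]) + math.exp(s * (a - xs[i]))) / s
        # Accept with probability exp(log_f(x) - hull(x)); the hull dominates log_f
        # because log_f is concave, so the ratio is always <= 1.
        hull = hs[i] + s * (x - xs[i])
        if math.log(rng.random()) <= log_f(x) - hull:
            return x

# Toy usage: Gamma(shape=3, rate=2), whose log-density is concave (constants omitted).
shape, rate = 3.0, 2.0
log_f = lambda x: (shape - 1) * math.log(x) - rate * x
dlog_f = lambda x: (shape - 1) / x - rate
samples = [tangent_envelope_sampler(log_f, dlog_f, [0.5, 1.0, 2.5], 1e-6, 50.0)
           for _ in range(2000)]
print(sum(samples) / len(samples))  # should be close to the Gamma mean, 1.5
```

In adaptive variants of this construction, rejected points are added to the set of abscissae so the hull tightens as sampling proceeds; the fixed hull above is kept only to keep the sketch short.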