Generalized mixtures of finite mixtures and telescoping sampling

Within a Bayesian framework, a comprehensive investigation of mixtures of finite mixtures (MFMs), i.e., finite mixtures with a prior on the number of components, is performed. This model class has applications in model-based clustering as well as for semi-parametric density estimation and requires suitable prior specifications and inference methods to exploit its full potential. We contribute by considering a generalized class of MFMs where the hyperparameter $\gamma_K$ of a symmetric Dirichlet prior on the weight distribution depends on the number of components. We show that this model class may be regarded as a Bayesian non-parametric mixture outside the class of Gibbs-type priors. We emphasize the distinction between the number of components $K$ of a mixture and the number of clusters $K_+$, i.e., the number of filled components given the data. In the MFM model, $K_+$ is a random variable and its prior depends on the prior on $K$ and on the hyperparameter $\gamma_K$. We employ a flexible prior distribution for the number of components $K$ and derive the corresponding prior on the number of clusters $K_+$ for generalized MFMs. For posterior inference, we propose the novel telescoping sampler which allows Bayesian inference for mixtures with arbitrary component distributions without resorting to reversible jump Markov chain Monte Carlo (MCMC) methods. The telescoping sampler explicitly samples the number of components, but otherwise requires only the usual MCMC steps of a finite mixture model. The ease of its application using different component distributions is demonstrated on several data sets.
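
The distinction between $K$ and $K_+$ can be illustrated with a small prior simulation. The sketch below is not the telescoping sampler and is not taken from the paper. It assumes, purely for illustration, a shifted Poisson prior $K - 1 \sim \mathrm{Poisson}(\lambda)$ as a stand-in for the flexible prior on $K$, and the specific choice $\gamma_K = \alpha / K$ as one example of a hyperparameter that depends on $K$. The function name `simulate_k_plus` and all of its parameters are hypothetical. Observations are allocated with the standard Pólya urn representation of the Dirichlet-multinomial, which integrates out the weights.

```python
import numpy as np


def simulate_k_plus(n_obs=30, n_draws=500, alpha=1.0, poisson_rate=3.0,
                    dynamic=True, seed=0):
    """Monte Carlo draws of (K, K_+) from the prior of an MFM.

    Assumptions (illustrative only, not from the source):
      * K - 1 ~ Poisson(poisson_rate)
      * gamma_K = alpha / K if dynamic, else gamma_K = alpha for every K
    """
    rng = np.random.default_rng(seed)
    k_draws = np.empty(n_draws, dtype=int)
    k_plus_draws = np.empty(n_draws, dtype=int)

    for d in range(n_draws):
        # Draw the number of components K >= 1 from the assumed prior.
        k = 1 + rng.poisson(poisson_rate)
        gamma_k = alpha / k if dynamic else alpha

        # Polya urn allocation: with the symmetric Dirichlet(gamma_K)
        # weights integrated out, observation i + 1 joins component j
        # with probability (n_j + gamma_K) / (i + K * gamma_K).
        counts = np.zeros(k)
        for i in range(n_obs):
            probs = (counts + gamma_k) / (i + k * gamma_k)
            j = rng.choice(k, p=probs)
            counts[j] += 1

        k_draws[d] = k
        # K_+ is the number of filled (non-empty) components.
        k_plus_draws[d] = np.count_nonzero(counts)

    return k_draws, k_plus_draws


if __name__ == "__main__":
    k, k_plus = simulate_k_plus()
    print("mean K:  ", k.mean())
    print("mean K_+:", k_plus.mean())
```

Every draw satisfies $K_+ \le K$ and $K_+ \le N$, so the induced prior on $K_+$ generally differs from the prior on $K$. Comparing `dynamic=True` with `dynamic=False` gives a rough feel for how the choice of $\gamma_K$ changes that induced prior.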
