Spying on the prior of the number of data clusters and the partition distribution in Bayesian cluster analysis

Mixture models are the key modelling approach for Bayesian cluster analysis. Different likelihood and prior specifications are required to capture the prototypical shape of the clusters. In addition, mixture modelling approaches differ crucially in the specification of the prior on the number of components and the prior on the component weight distribution. We investigate how these specifications affect the implicitly induced prior on the number of 'filled' components, i.e., data clusters, and the prior on the partitions. We derive computationally feasible calculations to obtain these implicit priors for reasonable data analysis settings and make a reference implementation available in the R package 'fipp'. In many applications the implicit priors are of more practical relevance than the explicit priors imposed, and thus suitable prior specifications depend on the implicit priors induced. We highlight the insights that may be gained from inspecting these implicit priors by analysing them for three different modelling approaches previously proposed for Bayesian cluster analysis: the Dirichlet process mixture, the static mixture of finite mixtures model, and the dynamic mixture of finite mixtures model. The default priors suggested in the literature for these modelling approaches are used and the induced priors are compared. Based on the implicit priors, we discuss the suitability of these modelling approaches and prior specifications when aiming at sparse cluster solutions and flexibility in the prior on the partitions.
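To make the notion of an induced prior on the number of data clusters concrete, the following minimal sketch computes it for the Dirichlet process mixture only, using the standard Chinese restaurant process recursion: the i-th observation opens a new cluster with probability alpha / (alpha + i - 1). This is not the implementation in 'fipp' and does not cover the static or dynamic mixture of finite mixtures models. The function name `dpm_prior_num_clusters` and the parameter names `n` (sample size) and `alpha` (concentration parameter) are hypothetical choices made for this illustration.

```python
def dpm_prior_num_clusters(n, alpha):
    """Induced prior P(K+ = k | n, alpha) for k = 0..n under a Dirichlet
    process partition prior (illustrative sketch, not the 'fipp' code).

    Returns a list `probs` with probs[k] = P(K+ = k); probs[0] is always 0.
    """
    if n < 1 or alpha <= 0:
        raise ValueError("n must be >= 1 and alpha must be > 0")

    # After the first observation there is exactly one data cluster.
    probs = [0.0, 1.0]

    for i in range(2, n + 1):
        # Probability that observation i opens a new cluster.
        p_new = alpha / (alpha + i - 1)
        nxt = [0.0] * (i + 1)
        for k in range(1, i):
            nxt[k] += probs[k] * (1.0 - p_new)   # joins an existing cluster
            nxt[k + 1] += probs[k] * p_new       # opens a new cluster
        probs = nxt

    return probs


if __name__ == "__main__":
    n, alpha = 100, 1.0  # assumed values, chosen only for illustration
    prior = dpm_prior_num_clusters(n, alpha)
    mean_k = sum(k * p for k, p in enumerate(prior))
    print(f"E[K+] for n={n}, alpha={alpha}: {mean_k:.3f}")
```

The recursion costs O(n^2) operations and avoids evaluating Stirling numbers directly. Plotting `prior` for several values of `alpha` shows how strongly the explicit choice of the concentration parameter shapes the implicit prior on the number of data clusters.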
