In Bayesian density estimation, a question of interest is how the number of components in a finite mixture model grows with the number of observations. We offer a novel perspective on this question: using results from stochastic geometry, we show that the expected number of components of a finite mixture model whose components belong to the unit simplex $\Delta^{J-1}$ of the Euclidean space $\mathbb{R}^J$ grows at rate $(\log n)^{J-1}$, where $n$ is the number of observations. We also provide a central limit theorem for the number of components. In addition, we relate our model to a classical non-parametric density estimator based on a P\'olya tree. Combining the latter with techniques from Choquet theory, we are able to recover the mixture weights. We also give the rate of convergence of the P\'olya tree posterior to the Dirac measure on the weights. Finally, we present an algorithm to correctly specify the number of components in a latent Dirichlet allocation (LDA) analysis.