We revisit convergence rates for maximum likelihood estimation (MLE) under finite mixture models. The Wasserstein distance has become a standard loss function for analyzing parameter estimation in these models, due in part to its ability to circumvent label switching and to accurately characterize the behaviour of fitted mixture components with vanishing weights. However, the Wasserstein metric can only capture the worst-case convergence rate among the remaining fitted mixture components. We demonstrate that when the log-likelihood function is penalized to discourage vanishing mixing weights, stronger loss functions can be derived that resolve this shortcoming of the Wasserstein distance. These new loss functions accurately capture the heterogeneity in convergence rates of fitted mixture components, and we use them to sharpen existing pointwise and uniform convergence rates in various classes of mixture models. In particular, these results imply that a subset of the components of the penalized MLE typically converges significantly faster than could have been anticipated from past work. We further show that some of these conclusions extend to the traditional MLE. Our theoretical findings are supported by a simulation study illustrating these improved convergence rates.
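For concreteness, here is a minimal sketch of the objects the abstract refers to; the notation and the specific penalty are illustrative assumptions on our part, not taken from the paper. Writing the fitted and true mixing measures as atomic measures, the first-order Wasserstein loss between them reduces to an optimal transport problem over couplings of their weight vectors:
\[
\widehat{G} = \sum_{i=1}^{K} \widehat{\pi}_i\, \delta_{\widehat{\theta}_i}, \qquad
G_0 = \sum_{j=1}^{K_0} \pi_{0j}\, \delta_{\theta_{0j}}, \qquad
W_1\big(\widehat{G}, G_0\big) = \inf_{q \in \Pi(\widehat{\pi},\, \pi_0)} \sum_{i,j} q_{ij}\, \big\|\widehat{\theta}_i - \theta_{0j}\big\|.
\]
One standard way to discourage vanishing mixing weights (assumed here as an example; the paper's exact penalty may differ) is a logarithmic penalty on the weights of a candidate mixing measure \(G = \sum_{i=1}^{K} \pi_i\, \delta_{\theta_i}\), giving a penalized log-likelihood of the form
\[
\ell_n^{\mathrm{pen}}(G) = \sum_{t=1}^{n} \log p_G(X_t) + \lambda \sum_{i=1}^{K} \log \pi_i, \qquad \lambda > 0,
\]
which diverges to \(-\infty\) as any \(\pi_i \to 0\), so the penalized maximizer keeps all \(K\) fitted weights bounded away from zero.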