We revisit the classical problem of deriving convergence rates for the maximum likelihood estimator (MLE) in finite mixture models. The Wasserstein distance has become a standard loss function for the analysis of parameter estimation in these models, due in part to its ability to circumvent label switching and to accurately characterize the behaviour of fitted mixture components with vanishing weights. However, the Wasserstein distance can only capture the worst-case convergence rate among the remaining fitted mixture components. We demonstrate that when the log-likelihood function is penalized to discourage vanishing mixing weights, stronger loss functions can be derived that resolve this shortcoming of the Wasserstein distance. These new loss functions accurately capture the heterogeneity in convergence rates of fitted mixture components, and we use them to sharpen existing pointwise and uniform convergence rates in various classes of mixture models. In particular, these results imply that a subset of the components of the penalized MLE typically converges significantly faster than could have been anticipated from past work. We further show that some of these conclusions extend to the traditional MLE. Our theoretical findings are supported by a simulation study illustrating these improved convergence rates.
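To make the worst-case limitation concrete, the following is a minimal illustration in generic notation (the symbols below are ours for exposition and are not taken from the paper). For discrete mixing measures, the first-order Wasserstein distance is an optimal transport cost over atoms, so a single slowly converging atom dominates it:
\[
  W_1\bigl(\widehat{G}_n, G_0\bigr)
  \;=\; \inf_{q \in \Pi(\widehat{p},\, p^0)} \sum_{i,j} q_{ij}\,\bigl\lVert \widehat{\theta}_i - \theta^0_j \bigr\rVert,
  \qquad
  G_0 = \sum_j p^0_j\, \delta_{\theta^0_j},
  \quad
  \widehat{G}_n = \sum_i \widehat{p}_i\, \delta_{\widehat{\theta}_i},
\]
where \(\Pi(\widehat{p}, p^0)\) denotes the set of couplings of the two weight vectors. If, say, one atom is estimated at the rate \(n^{-1/2}\) while another converges only at the rate \(n^{-1/4}\), then \(W_1(\widehat{G}_n, G_0)\) is of order \(n^{-1/4}\): the slowest component sets the only rate that \(W_1\) can report, which is precisely the heterogeneity that the stronger loss functions are designed to disentangle.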