When a population exhibits heterogeneity, we often model it via a finite mixture: decompose it into several different but homogeneous subpopulations. Contemporary practice favors learning the mixture by maximizing the likelihood for its statistical efficiency and for the convenience of the EM algorithm in numerical computation. Yet the maximum likelihood estimate (MLE) is not well defined for the most widely used finite normal mixture in particular, or for finite location-scale mixtures in general. We therefore investigate feasible alternatives to the MLE, such as minimum distance estimators. Recently, the Wasserstein distance has drawn increased attention in the machine learning community. It has an intuitive geometric interpretation and has been successfully employed in many new applications. Do we gain anything by learning finite location-scale mixtures via a minimum Wasserstein distance estimator (MWDE)? This paper investigates this possibility in several respects. We find that the MWDE is consistent and derive a numerical solution under finite location-scale mixtures. We study its robustness against outliers and mild model mis-specifications. Our moderately sized simulation study shows that the MWDE suffers some efficiency loss against a penalized version of the MLE in general, without a noticeable gain in robustness. We reaffirm the general superiority of likelihood-based learning strategies even for non-regular finite location-scale mixtures.
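Because the one-dimensional Wasserstein distance has a closed form through CDFs and quantile functions, the MWDE objective is straightforward to prototype. The following is a minimal sketch, not the paper's implementation: it fits a two-component normal mixture by minimizing a grid approximation of the first-order distance W_1(F_n, F_theta) = integral of |F_n(x) - F_theta(x)| dx between the empirical CDF F_n and the mixture CDF F_theta. The logit/log reparameterizations, grid settings, and choice of Nelder-Mead are illustrative assumptions.

import numpy as np
from scipy.stats import norm
from scipy.optimize import minimize

def mixture_cdf(x, w, mu1, s1, mu2, s2):
    # CDF of the two-component mixture w*N(mu1, s1^2) + (1-w)*N(mu2, s2^2)
    return w * norm.cdf(x, mu1, s1) + (1.0 - w) * norm.cdf(x, mu2, s2)

def w1_objective(theta, data, grid):
    # Grid (Riemann-sum) approximation of W_1 = integral |F_n(x) - F_theta(x)| dx
    a, mu1, log_s1, mu2, log_s2 = theta
    w = 1.0 / (1.0 + np.exp(-a))             # logistic map keeps the weight in (0, 1)
    s1, s2 = np.exp(log_s1), np.exp(log_s2)  # log map keeps the scales positive
    F_emp = np.searchsorted(np.sort(data), grid, side="right") / len(data)
    F_mix = mixture_cdf(grid, w, mu1, s1, mu2, s2)
    return np.sum(np.abs(F_emp - F_mix)) * (grid[1] - grid[0])

# Synthetic data from a two-component location-scale (normal) mixture
rng = np.random.default_rng(0)
data = np.concatenate([rng.normal(0.0, 1.0, 300), rng.normal(4.0, 1.5, 200)])
grid = np.linspace(data.min() - 3.0, data.max() + 3.0, 2000)
theta0 = np.array([0.0, np.quantile(data, 0.25), 0.0, np.quantile(data, 0.75), 0.0])
fit = minimize(w1_objective, theta0, args=(data, grid), method="Nelder-Mead")
print(fit.x)  # estimates on the transformed (logit/log) scale

The paper's numerical solution for the MWDE may well differ, for instance by working with the second-order distance via quantile functions; this sketch only illustrates the estimating principle of matching the fitted mixture to the empirical distribution in Wasserstein distance.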