The Gaussian mixture model (GMM) provides a convenient yet principled framework for clustering, with properties suitable for statistical inference. In this paper, we propose a new model-based clustering algorithm, called EGMM (evidential GMM), within the theoretical framework of belief functions to better characterize cluster-membership uncertainty. The cluster membership of each object is represented by a mass function, and the entire dataset is modeled by an evidential Gaussian mixture distribution whose components are defined over the powerset of the desired clusters. The parameters of EGMM are estimated by a specially designed Expectation-Maximization (EM) algorithm. A validity index that allows the proper number of clusters to be determined automatically is also provided. The proposed EGMM is as convenient as the classical GMM, but it can generate a more informative evidential partition of the dataset under consideration. Experiments with synthetic and real datasets demonstrate the good performance of the proposed method compared with several other prototype-based and model-based clustering techniques.
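To make the idea of mixture components defined over the powerset more concrete, the following is a minimal Python sketch of a single E-step-like pass. It is an illustration only and not the EGMM algorithm itself. The function names `nonempty_subsets` and `evidential_e_step` are hypothetical, and the sketch rests on several assumptions that the abstract does not state: the empty set is left out, the mean of a multi-cluster subset is taken as the barycenter of its member cluster means, and all components share one isotropic variance `sigma2`.

```python
from itertools import combinations

import numpy as np


def nonempty_subsets(c):
    """All non-empty subsets of {0, ..., c-1}, as tuples of cluster indices."""
    return [s for r in range(1, c + 1) for s in combinations(range(c), r)]


def evidential_e_step(X, means, weights, sigma2=1.0):
    """Return the subsets and an (n, 2**c - 1) array of masses per object.

    Assumptions (not taken from the source): subset mean = barycenter of the
    member cluster means; one shared isotropic variance sigma2.
    """
    subsets = nonempty_subsets(len(means))
    centers = np.array([means[list(s)].mean(axis=0) for s in subsets])
    d = X.shape[1]
    # Squared distance from every object to every subset center.
    sq = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    log_pdf = -0.5 * sq / sigma2 - 0.5 * d * np.log(2 * np.pi * sigma2)
    log_w = np.log(weights) + log_pdf
    # Subtract the row maximum before exponentiating for numerical stability.
    log_w -= log_w.max(axis=1, keepdims=True)
    m = np.exp(log_w)
    m /= m.sum(axis=1, keepdims=True)  # each row is a normalized mass function
    return subsets, m


# Two clusters give three components: {0}, {1}, and the meta-cluster {0, 1}.
means = np.array([[0.0, 0.0], [4.0, 0.0]])
X = np.array([[0.0, 0.0], [2.0, 0.0], [4.0, 0.0]])
subsets, m = evidential_e_step(X, means, np.full(3, 1.0 / 3.0))
```

In this toy setup, the object lying midway between the two cluster means receives most of its mass on the subset `(0, 1)`, which is the kind of ambiguity a hard or purely probabilistic partition cannot express directly. A full EM procedure would alternate such a pass with parameter updates; those update rules are not given in the abstract and are not sketched here.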