We propose the MoEClust suite of models, a family of model-based clustering methods for continuous, correlated data that account for external information available in the presence of mixed-type fixed covariates. These models allow different subsets of covariates to influence the component weights and/or the component densities by modelling the parameters of the mixture as functions of the covariates. A familiar range of constrained eigen-decomposition parameterisations of the component covariance matrices is also accommodated. This paper thus addresses the equivalent aims of including covariates in Gaussian parsimonious clustering models and incorporating parsimonious covariance structures into all special cases of the Gaussian mixture of experts framework. The MoEClust models demonstrate significant improvement from both perspectives in applications to both univariate and multivariate data sets. We also propose novel extensions that include a uniform noise component for capturing outliers and that address initialisation of the EM algorithm, model selection, and the visualisation of results.
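To make concrete what it means to model the mixture parameters as functions of the covariates, the following is a sketch of a Gaussian mixture of experts density with an eigen-decomposed covariance. The notation is assumed here for illustration and does not come from the abstract: $\mathbf{y}_i$ is the response, $\mathbf{x}_i^{(\mathrm{gate})}$ and $\mathbf{x}_i^{(\mathrm{exp})}$ are the (possibly different) covariate subsets entering the weights and the densities, a tilde denotes a covariate vector with an intercept prepended, and $\boldsymbol{\beta}_g$ and $\boldsymbol{\Gamma}_g$ are hypothetical names for the regression coefficients.

```latex
\begin{align*}
f(\mathbf{y}_i \mid \mathbf{x}_i)
  &= \sum_{g=1}^{G}
     \tau_g\bigl(\mathbf{x}_i^{(\mathrm{gate})}\bigr)\,
     \phi\bigl(\mathbf{y}_i ;\,
       \boldsymbol{\mu}_g\bigl(\mathbf{x}_i^{(\mathrm{exp})}\bigr),\,
       \boldsymbol{\Sigma}_g\bigr), \\
\tau_g\bigl(\mathbf{x}_i^{(\mathrm{gate})}\bigr)
  &= \frac{\exp\bigl(\boldsymbol{\beta}_g^{\top}\tilde{\mathbf{x}}_i^{(\mathrm{gate})}\bigr)}
          {\sum_{h=1}^{G}\exp\bigl(\boldsymbol{\beta}_h^{\top}\tilde{\mathbf{x}}_i^{(\mathrm{gate})}\bigr)}, \\
\boldsymbol{\mu}_g\bigl(\mathbf{x}_i^{(\mathrm{exp})}\bigr)
  &= \boldsymbol{\Gamma}_g^{\top}\tilde{\mathbf{x}}_i^{(\mathrm{exp})}, \\
\boldsymbol{\Sigma}_g
  &= \lambda_g \mathbf{D}_g \mathbf{A}_g \mathbf{D}_g^{\top},
  \qquad \lvert \mathbf{A}_g \rvert = 1 .
\end{align*}
```

Here $\phi$ is the multivariate Gaussian density, and $\lambda_g$, $\mathbf{D}_g$, and $\mathbf{A}_g$ control the volume, orientation, and shape of component $g$. Constraining these three terms to be equal across components, or fixing $\mathbf{D}_g$ or $\mathbf{A}_g$ to the identity, gives the parsimonious covariance structures. Dropping the covariates from the weights, the means, or both recovers the special cases of the mixture of experts framework, with the covariate-free Gaussian mixture model as the limiting case.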