PPGMMGA is a Projection Pursuit (PP) algorithm aimed at detecting and visualizing clustering structures in multivariate data. The algorithm uses negentropy as the PP index, computed from a density estimate obtained by fitting Gaussian Mixture Models (GMMs), and maximizes this index using Genetic Algorithms (GAs). Since PPGMMGA is a dimension reduction technique introduced specifically for visualization purposes, it does not explicitly provide cluster memberships. In this paper, a modal clustering approach is proposed for estimating clusters of projected data points. In particular, a modal EM algorithm is employed to estimate the modes, that is, the local maxima of the underlying density in the projection subspace, where this density is estimated using parsimonious GMMs. Data points are then clustered according to the domain of attraction of the identified modes: each point is assigned to the mode it reaches when ascending the estimated density. Simulated and real data are discussed to illustrate the proposed method and evaluate its clustering performance.
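The modal EM step and the assignment by domain of attraction can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a one-dimensional projection and an already fitted Gaussian mixture, and the function names (`modal_em`, `modal_clustering`) and parameters (`weights`, `means`, `sds`, `merge_tol`) are hypothetical names chosen for illustration. Starting from a point, each iteration computes the posterior component probabilities at the current location and then moves to the precision-weighted average of the component means. Points whose iterations end at the same mode receive the same label.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a univariate normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def modal_em(x, weights, means, sds, tol=1e-10, max_iter=1000):
    """Ascend from x to a local maximum of a univariate Gaussian mixture density."""
    for _ in range(max_iter):
        # E-step: posterior probability of each component at the current x.
        dens = [w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, sds)]
        total = sum(dens)
        if total == 0.0:
            return x  # density underflows here; no informative update is possible
        post = [d / total for d in dens]
        # M-step: precision-weighted average of the component means.
        precision = [p / (s * s) for p, s in zip(post, sds)]
        x_new = sum(q * m for q, m in zip(precision, means)) / sum(precision)
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    return x


def modal_clustering(points, weights, means, sds, merge_tol=1e-4):
    """Label each point by the mode it converges to (its domain of attraction)."""
    modes, labels = [], []
    for x in points:
        mode = modal_em(x, weights, means, sds)
        for j, known in enumerate(modes):
            if abs(mode - known) < merge_tol:  # same mode up to numerical tolerance
                labels.append(j)
                break
        else:
            modes.append(mode)
            labels.append(len(modes) - 1)
    return modes, labels
```

Note that the number of clusters is the number of distinct modes, which can be smaller than the number of mixture components: two heavily overlapping components may share a single mode. In a projection subspace of dimension greater than one, the same update applies with covariance matrices in place of the variances.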