Most finite mixture models cannot accommodate asymmetric tail dependence within components and fail to capture non-elliptical clusters in clustering applications. Because vine copulas are very flexible in capturing these types of dependence, we propose a novel vine copula mixture model for continuous data. We discuss model selection and parameter estimation and formulate a new model-based clustering algorithm. Using vine copulas in clustering allows for a range of cluster shapes and dependence structures. Our simulation experiments illustrate a significant gain in clustering accuracy when the components exhibit notably asymmetric tail dependence and/or non-Gaussian margins. We also apply the proposed method to real data sets. We show that the model-based clustering algorithm with vine copula mixture models outperforms other model-based clustering techniques, especially for non-Gaussian multivariate data.