This paper presents a multiple-learner algorithm, the Three Ensemble Clustering (3EC) algorithm, that groups unlabeled data into quality clusters as part of unsupervised learning. It offers the flexibility to explore the context of new clusters formed by an ensemble of algorithms, guided by internal validation indices. The input data set is treated as a cluster of clusters, and an anomaly may itself manifest as a cluster. Each partitioned cluster is treated as a new data set and becomes a candidate for finding the best algorithm and number of partitions, until a predefined stopping criterion is met. The algorithms partition the data set into clusters independently, and the quality of each partitioning is assessed by an ensemble of internal cluster validation indices. 3EC presents the validation index scores for each choice of algorithm and number of partitions in a structure called the Tau Grid, and selects the best-scoring configuration. The algorithm owes its name to its two input ensembles (algorithms and internal validation indices) and its output ensemble of final clusters. Quality plays a central role in this approach and also serves as the stopping criterion that prevents further partitioning. It is determined from the scores that the ensemble of validation indices assigns to the clusters produced by an algorithm at its optimal number of splits. The user configures the stopping criteria by providing quality thresholds for the score range of each validation index and the optimal size of an output cluster. Users can experiment with different sets of stopping criteria and choose the most 'sensible group' of quality clusters.
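The loop described in the abstract (score every algorithm and partition count, pick the best cell, recurse into each cluster until a stopping criterion holds) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract names neither the algorithms nor the indices, so the two clusterers (`k_means`, `k_center`), the two indices (silhouette, Davies-Bouldin), the selection rule in `best_cell`, and the thresholds `min_silhouette` and `max_size` are all hypothetical choices made for this example.

```python
import math
from itertools import product

def centroid(points):
    return tuple(sum(c) / len(points) for c in zip(*points))

def farthest_first(points, k):
    # Deterministic seeding: start at points[0], then keep adding the farthest point.
    seeds = [points[0]]
    while len(seeds) < k:
        seeds.append(max(points, key=lambda p: min(math.dist(p, s) for s in seeds)))
    return seeds

def assign(points, centers):
    return [min(range(len(centers)), key=lambda j: math.dist(p, centers[j])) for p in points]

def k_center(points, k):  # assumed ensemble member 1
    return assign(points, farthest_first(points, k))

def k_means(points, k, iters=20):  # assumed ensemble member 2 (Lloyd iterations)
    centers = farthest_first(points, k)
    labels = assign(points, centers)
    for _ in range(iters):
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:
                centers[j] = centroid(members)
        new_labels = assign(points, centers)
        if new_labels == labels:
            break
        labels = new_labels
    return labels

def group(points, labels):
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    return clusters

def silhouette(points, labels):  # higher is better; -1.0 if fewer than 2 clusters
    clusters = group(points, labels)
    if len(clusters) < 2:
        return -1.0
    total = 0.0
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1:
            continue  # a singleton contributes 0 by convention
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        b = min(sum(math.dist(p, q) for q in c) / len(c)
                for m, c in clusters.items() if m != l)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)

def davies_bouldin(points, labels):  # lower is better
    clusters = group(points, labels)
    if len(clusters) < 2:
        return math.inf
    cents = {l: centroid(c) for l, c in clusters.items()}
    scatter = {l: sum(math.dist(p, cents[l]) for p in c) / len(c) for l, c in clusters.items()}
    worst = [max((scatter[i] + scatter[j]) / max(math.dist(cents[i], cents[j]), 1e-12)
                 for j in clusters if j != i) for i in clusters]
    return sum(worst) / len(worst)

def tau_grid(points, algorithms, ks):
    # One cell per (algorithm, number of partitions), holding labels and index scores.
    grid = {}
    for (name, algo), k in product(algorithms.items(), ks):
        labels = algo(points, k)
        grid[(name, k)] = {"labels": labels,
                           "silhouette": silhouette(points, labels),
                           "davies_bouldin": davies_bouldin(points, labels)}
    return grid

def best_cell(grid):
    # Assumed rule: highest silhouette, ties broken by lowest Davies-Bouldin.
    return max(grid, key=lambda c: (grid[c]["silhouette"], -grid[c]["davies_bouldin"]))

def three_ec(points, algorithms, ks, min_silhouette=0.6, max_size=4):
    # Stop when the cluster is small enough or no split meets the quality threshold.
    usable = [k for k in ks if k < len(points)]
    if len(points) <= max_size or not usable:
        return [points]
    grid = tau_grid(points, algorithms, usable)
    cell = grid[best_cell(grid)]
    if cell["silhouette"] < min_silhouette:
        return [points]
    final = []
    for members in group(points, cell["labels"]).values():
        final.extend(three_ec(members, algorithms, ks, min_silhouette, max_size))
    return final

# Toy data: three well-separated 2x2 blobs.
points = [(x + dx, y + dy) for x, y in [(0, 0), (10, 10), (20, 0)]
          for dx in (0, 1) for dy in (0, 1)]
algorithms = {"k_means": k_means, "k_center": k_center}
grid = tau_grid(points, algorithms, [2, 3, 4])
best = best_cell(grid)
clusters = three_ec(points, algorithms, [2, 3, 4])
print(best, len(clusters))
```

On this toy data the grid has six cells (two algorithms times three partition counts), and the recursion stops once every cluster is at or below `max_size`. Changing `min_silhouette` or `max_size` mirrors the experimentation with stopping criteria that the abstract describes.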