A new index for the internal evaluation of clustering is introduced. The index is defined as a mixture of two sub-indices: the first, $ I_a $, is called the Ambiguous Index; the second, $ I_s $, is called the Similarity Index. Both sub-indices are calculated from a density estimate for each cluster of a partition of the data. An experiment on a set of 145 datasets tests the performance of the new index and compares it with six other internal clustering evaluation indices: the Calinski-Harabasz index, the Silhouette coefficient, the Davies-Bouldin index, CDbw, DBCV, and VIASCKDE. The results show that the new index significantly improves on the other internal clustering evaluation indices.
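The abstract does not define $ I_a $ or $ I_s $, so the following is only a minimal sketch of the building block it names: a separate density estimate for each cluster of a partition. The one-dimensional Gaussian kernel, the fixed bandwidth of 0.5, the toy data, and the helper names `gaussian_kde_1d` and `per_cluster_densities` are all assumptions made for illustration, not the authors' method.

```python
import math
from collections import defaultdict


def gaussian_kde_1d(samples, bandwidth):
    """Return a function that estimates the density of `samples`
    using a Gaussian kernel (assumed kernel, fixed bandwidth)."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


def per_cluster_densities(points, labels, bandwidth=0.5):
    """Fit one density estimate per cluster of a partition
    (hypothetical helper, not taken from the paper)."""
    clusters = defaultdict(list)
    for x, label in zip(points, labels):
        clusters[label].append(x)
    return {
        label: gaussian_kde_1d(xs, bandwidth) for label, xs in clusters.items()
    }


# Toy partition: two well-separated clusters on the real line.
points = [0.0, 0.2, 0.4, 5.0, 5.1, 5.3]
labels = [0, 0, 0, 1, 1, 1]
densities = per_cluster_densities(points, labels)
```

Each entry of `densities` can then be evaluated at any point, for example `densities[0](0.2)`, which is the kind of quantity a density-based index could compare across clusters.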