In this work, we introduce a nonparametric stopping rule for clustering, based on the spatial median, for determining the number of clusters. The proposed method aims to balance homogeneity within clusters against heterogeneity between clusters. It maximises the ratio of between-cluster variation to within-cluster variation while adjusting for the number of clusters and the number of observations. The proposed algorithm is robust to departures from distributional assumptions and to the presence of outliers. We validated the algorithm through simulations and further evaluated its stability and efficacy on three real-world datasets. Moreover, we compared its performance with that of 13 traditional algorithms for determining the number of clusters. The proposed algorithm outperformed 11 of these 13 algorithms in determining the number of clusters. These findings demonstrate that the proposed method provides a reliable alternative for determining the number of clusters in multivariate data.
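To make the idea of a median-based variation ratio concrete, the following is a minimal sketch, not the authors' exact criterion, which the abstract does not state. It assumes a Calinski-Harabasz-style form in which cluster centres are spatial medians (approximated here with Weiszfeld iterations), variation is measured by unsquared Euclidean distance, and the adjustment for the number of clusters k and observations n uses the factors (k - 1) and (n - k). The names `spatial_median` and `median_ratio_index` are hypothetical.

```python
import math

def spatial_median(points, tol=1e-9, max_iter=1000):
    """Approximate the spatial (geometric) median with Weiszfeld iterations."""
    dim = len(points[0])
    # Start from the coordinate-wise mean.
    m = [sum(p[j] for p in points) / len(points) for j in range(dim)]
    for _ in range(max_iter):
        # Inverse-distance weights; the floor avoids division by zero
        # when the current estimate coincides with a data point.
        weights = [1.0 / max(math.dist(p, m), 1e-12) for p in points]
        total = sum(weights)
        new_m = [sum(w * p[j] for w, p in zip(weights, points)) / total
                 for j in range(dim)]
        if math.dist(new_m, m) < tol:
            return new_m
        m = new_m
    return m

def median_ratio_index(points, labels):
    """Assumed index: adjusted between/within variation around spatial medians."""
    n = len(points)
    groups = {}
    for p, lab in zip(points, labels):
        groups.setdefault(lab, []).append(p)
    k = len(groups)
    if not 2 <= k < n:
        raise ValueError("need 2 <= number of clusters < number of points")
    overall = spatial_median(points)
    between = 0.0
    within = 0.0
    for members in groups.values():
        centre = spatial_median(members)
        # Between-cluster term: size-weighted distance of the cluster
        # median from the overall median (assumed form).
        between += len(members) * math.dist(centre, overall)
        # Within-cluster term: sum of distances to the cluster median.
        within += sum(math.dist(p, centre) for p in members)
    if within == 0.0:
        return math.inf
    return (between / (k - 1)) / (within / (n - k))

# Toy data: two well-separated groups of four points each.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11)]
good = [0, 0, 0, 0, 1, 1, 1, 1]   # labels matching the true groups
bad = [0, 1, 0, 1, 0, 1, 0, 1]    # labels mixing the two groups
score_good = median_ratio_index(points, good)
score_bad = median_ratio_index(points, bad)
```

Under these assumptions, a stopping rule would compute the index for the partition produced at each candidate number of clusters and keep the number that gives the largest value. In the toy data, the labelling that matches the two groups scores higher than the mixed labelling.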