DBSCAN is a widely used density-based clustering algorithm. However, as demand for multi-density clustering grows, traditional DBSCAN fails to cluster multi-density datasets well. To address this problem, this paper proposes an adaptive multi-density DBSCAN algorithm (AMD-DBSCAN). AMD-DBSCAN uses an improved parameter adaptation method to search for multiple parameter pairs (i.e., Eps and MinPts), the key parameters that determine clustering results and performance, which allows the model to be applied to multi-density datasets. Moreover, AMD-DBSCAN requires only one hyperparameter, which avoids complicated, repetitive initialization. Furthermore, the variance of the number of neighbors (VNN) is proposed to measure the difference in density among clusters. Experimental results show that AMD-DBSCAN reduces execution time by 75% on average compared with the traditional adaptive algorithm, owing to its lower algorithmic complexity. In addition, AMD-DBSCAN improves accuracy by 24.7% on average over the state-of-the-art design on multi-density datasets of extremely variable density, with no performance loss in single-density scenarios.
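To make the idea behind VNN concrete, the following is a minimal sketch, not the paper's implementation. The abstract does not give the exact formula, so this sketch assumes VNN is the population variance of the per-point count of neighbors within a fixed radius `eps`. The function names `neighbor_counts` and `variance_of_neighbor_counts` are hypothetical and chosen only for illustration.

```python
import numpy as np


def neighbor_counts(points, eps):
    """Count, for each point, how many other points lie within distance eps.

    Assumption: Euclidean distance and a closed ball (distance <= eps).
    """
    points = np.asarray(points, dtype=float)
    # Pairwise differences via broadcasting: shape (n, n, d).
    diffs = points[:, None, :] - points[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    # Subtract 1 so that a point is not counted as its own neighbor.
    return (dists <= eps).sum(axis=1) - 1


def variance_of_neighbor_counts(points, eps):
    """Assumed VNN: population variance of the neighbor counts."""
    return float(np.var(neighbor_counts(points, eps)))


# A dense group (spacing 0.1) next to a sparse group (spacing 1.0).
dense = [(0.1 * i, 0.0) for i in range(5)]
sparse = [(10.0 + i, 0.0) for i in range(5)]
mixed = np.array(dense + sparse)

# Four corners of a unit square: every point has the same neighborhood.
square = np.array([(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)])

vnn_mixed = variance_of_neighbor_counts(mixed, eps=0.5)
vnn_square = variance_of_neighbor_counts(square, eps=1.0)
```

Under this assumed definition, a dataset whose points all have similar neighborhood sizes yields a value near zero, while a dataset that mixes dense and sparse regions yields a larger value, which matches the stated purpose of measuring density differences.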