Ensemble clustering integrates a set of base clustering results to generate a stronger one. Existing methods usually rely on a co-association (CA) matrix, which records how often two samples are grouped into the same cluster across the base clusterings. However, when the constructed CA matrix is of low quality, clustering performance degrades. In this paper, we propose a simple yet effective CA matrix self-enhancement framework that improves the CA matrix to achieve better clustering performance. Specifically, we first extract the high-confidence (HC) information from the base clusterings to form a sparse HC matrix. By propagating the highly reliable information of the HC matrix to the CA matrix while simultaneously complementing the HC matrix according to the CA matrix, the proposed method generates an enhanced CA matrix for better clustering. Technically, the proposed model is formulated as a symmetric constrained convex optimization problem, which is efficiently solved by an alternating iterative algorithm with convergence and global optimality theoretically guaranteed. Extensive experimental comparisons with twelve state-of-the-art methods on eight benchmark datasets substantiate the effectiveness, flexibility, and efficiency of the proposed model in ensemble clustering. The code and datasets can be downloaded at https://github.com/Siritao/EC-CMS.
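To make the two matrices concrete, the following is a minimal sketch of how a CA matrix can be built from base clusterings and how a sparse HC matrix might be derived from it. It is not the released EC-CMS code: the function names `co_association` and `high_confidence`, the normalization by the number of base clusterings, and the thresholding rule with `threshold=0.8` are all assumptions made for illustration. The propagation and optimization steps of the proposed model are not shown.

```python
import numpy as np

def co_association(base_labels):
    """Fraction of base clusterings in which each pair of samples shares a cluster.

    base_labels: array-like of shape (m, n), one row of cluster labels per
    base clustering (m clusterings, n samples).
    """
    base_labels = np.asarray(base_labels)
    m, n = base_labels.shape
    ca = np.zeros((n, n))
    for labels in base_labels:
        # Pairwise label comparison gives an n x n 0/1 agreement matrix.
        ca += labels[:, None] == labels[None, :]
    return ca / m

def high_confidence(ca, threshold=0.8):
    """Keep only entries at or above the threshold (assumed rule); zero the rest."""
    return np.where(ca >= threshold, ca, 0.0)

# Three toy base clusterings of four samples (hypothetical data).
base = [
    [0, 0, 1, 1],
    [0, 0, 0, 1],
    [1, 1, 0, 0],
]
ca = co_association(base)
hc = high_confidence(ca, threshold=0.8)
```

In this toy example, samples 0 and 1 are grouped together in all three base clusterings, so their CA entry is 1 and survives in the HC matrix, whereas samples 2 and 3 agree in only two of three clusterings and are dropped at the assumed threshold.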