Spectral clustering is one of the most popular clustering methods. However, balancing the efficiency and effectiveness of large-scale spectral clustering under limited computing resources has long remained an open problem. In this paper, we propose a divide-and-conquer based large-scale spectral clustering method that strikes a good balance between the two. The method combines a divide-and-conquer based landmark selection algorithm with a novel approximate similarity matrix approach to construct a sparse similarity matrix at extremely low cost. Clustering results are then computed quickly through a bipartite graph partition process. The proposed method achieves lower computational complexity than most existing large-scale spectral clustering methods. Experimental results on ten large-scale datasets demonstrate its efficiency and effectiveness. The MATLAB code of the proposed method and the experimental datasets are available at https://github.com/Li-Hongmin/MyPaperWithCode.
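To make the pipeline in the abstract concrete, the following is a minimal pure-Python sketch of the general landmark-based idea: pick a few landmarks, build a sparse point-to-landmark similarity matrix, and split the resulting bipartite graph spectrally. It is not the authors' MATLAB implementation, and several parts are assumptions made for illustration. The landmarks come from a plain strided subsample rather than the paper's divide-and-conquer selection. The similarity is a Gaussian kernel restricted to each point's `k` nearest landmarks, which only stands in for the paper's approximate similarity matrix. The partition is a two-way split by the sign of the second singular vector, found by power iteration, instead of a full k-way bipartite graph partition. The function name `landmark_bipartite_bisect` and all of its parameters are hypothetical.

```python
import math
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def landmark_bipartite_bisect(points, p=8, k=2, iters=300, seed=0):
    """Two-way landmark-based spectral split (illustrative sketch only)."""
    n = len(points)

    # Assumption: strided subsample as a stand-in for the paper's
    # divide-and-conquer landmark selection.
    step = max(1, n // p)
    landmarks = points[::step][:p]
    p = len(landmarks)

    # Sparse n x p similarity: each point keeps only its k nearest landmarks.
    nearest = [
        sorted((sq_dist(x, l), j) for j, l in enumerate(landmarks))[:k]
        for x in points
    ]
    kept = [d for row in nearest for d, _ in row]
    sigma2 = sum(kept) / len(kept) or 1.0  # kernel bandwidth (assumed heuristic)
    B = [[(j, math.exp(-d / (2.0 * sigma2))) for d, j in row] for row in nearest]

    # Degrees of the bipartite graph: d1 for points, d2 for landmarks.
    d1 = [sum(w for _, w in row) for row in B]
    d2 = [0.0] * p
    for row in B:
        for j, w in row:
            d2[j] += w

    # Normalized matrix Bn = D1^{-1/2} B D2^{-1/2}, still stored sparsely.
    Bn = [
        [(j, w / math.sqrt(d1[i] * d2[j])) for j, w in row]
        for i, row in enumerate(B)
    ]

    # The top right singular vector of Bn is known in closed form: sqrt(d2).
    norm = math.sqrt(sum(d2))
    v1 = [math.sqrt(x) / norm for x in d2]

    # Power iteration on Bn^T Bn, deflating v1, to get the second vector.
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in range(p)]
    for _ in range(iters):
        dot = sum(a * b for a, b in zip(v, v1))
        v = [a - dot * b for a, b in zip(v, v1)]
        u = [sum(w * v[j] for j, w in row) for row in Bn]  # u = Bn v
        v = [0.0] * p
        for ui, row in zip(u, Bn):
            for j, w in row:
                v[j] += w * ui  # v = Bn^T u
        norm = math.sqrt(sum(a * a for a in v))
        v = [a / norm for a in v]

    # Embed every point with the second left singular vector; split by sign.
    u = [sum(w * v[j] for j, w in row) for row in Bn]
    return [1 if ui > 0 else 0 for ui in u]
```

Calling `landmark_bipartite_bisect(points, p=8, k=2)` on a list of coordinate tuples returns one 0/1 label per point. Because each point touches only `k` landmarks, one power-iteration step costs on the order of `n * k` operations, which illustrates why a sparse point-to-landmark matrix keeps the spectral step cheap compared with a full `n x n` similarity matrix.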