Spectral clustering is one of the most popular clustering methods. However, the high computational cost of the eigen-decomposition it requires hinders its application to large-scale tasks. In this paper, we use spectrum-preserving node reduction to accelerate eigen-decomposition and to generate concise representations of data sets. Specifically, we create a small number of pseudo-nodes based on spectral similarity. Then, a standard spectral clustering algorithm is performed on the smaller node set. Finally, each data point in the original data set is assigned to the same cluster as its representative pseudo-node. The proposed framework runs in nearly-linear time. Meanwhile, the clustering accuracy can be significantly improved by mining concise representations. The experimental results show dramatically improved clustering performance when compared with state-of-the-art methods.
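The three steps described above (reduce to pseudo-nodes, cluster the reduced set, propagate labels back) can be illustrated with a minimal sketch. It is not the method of this paper: the spectrum-preserving node reduction is not specified in the abstract, so plain k-means centroids stand in for the pseudo-nodes here. The function names, the Gaussian kernel width `sigma`, and the farthest-first initialization are all assumptions made for illustration.

```python
import numpy as np

def farthest_first(X, k, rng):
    # Hypothetical helper: pick k spread-out rows of X as initial centers.
    idx = [int(rng.integers(len(X)))]
    d = ((X - X[idx[0]]) ** 2).sum(axis=1)
    for _ in range(k - 1):
        nxt = int(d.argmax())
        idx.append(nxt)
        d = np.minimum(d, ((X - X[nxt]) ** 2).sum(axis=1))
    return X[idx].copy()

def kmeans(X, k, n_iter=50, seed=0):
    # Plain Lloyd iterations; returns (centers, label of each row of X).
    rng = np.random.default_rng(seed)
    centers = farthest_first(X, k, rng)
    for _ in range(n_iter):
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members):          # an empty cluster keeps its old center
                centers[j] = members.mean(axis=0)
    d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return centers, d.argmin(axis=1)

def spectral_labels(P, n_clusters, sigma=1.0, seed=0):
    # Standard normalized spectral clustering on the small node set P.
    d2 = ((P[:, None, :] - P[None, :, :]) ** 2).sum(axis=2)
    W = np.exp(-d2 / (2.0 * sigma ** 2))      # Gaussian affinity (assumed)
    np.fill_diagonal(W, 0.0)
    inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1) + 1e-12)
    L = np.eye(len(P)) - inv_sqrt[:, None] * W * inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L)               # eigenvalues in ascending order
    U = vecs[:, :n_clusters]
    U = U / (np.linalg.norm(U, axis=1, keepdims=True) + 1e-12)
    _, labels = kmeans(U, n_clusters, seed=seed)
    return labels

def reduced_spectral_clustering(X, n_pseudo, n_clusters, sigma=1.0, seed=0):
    # Step 1: build pseudo-nodes (k-means stand-in, NOT spectral reduction).
    pseudo, assignment = kmeans(X, n_pseudo, seed=seed)
    # Step 2: eigen-decomposition on n_pseudo nodes instead of len(X) nodes.
    pseudo_labels = spectral_labels(pseudo, n_clusters, sigma=sigma, seed=seed)
    # Step 3: each point inherits the cluster of its representative pseudo-node.
    return pseudo_labels[assignment]

# Toy data: two well-separated Gaussian blobs of 200 points each.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.5, size=(200, 2)),
               rng.normal(10.0, 0.5, size=(200, 2))])
labels = reduced_spectral_clustering(X, n_pseudo=20, n_clusters=2)
print(np.bincount(labels))
```

In this sketch the eigen-decomposition is performed on a 20 x 20 Laplacian rather than a 400 x 400 one, which is where the cost saving of any node-reduction scheme comes from. The quality of the final clustering depends on how well the pseudo-nodes preserve the structure of the original graph, which is the part the paper addresses with spectral similarity.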