This work studies the classical spectral clustering algorithm, which embeds the vertices of a graph $G=(V_G, E_G)$ into $\mathbb{R}^k$ using $k$ eigenvectors of a matrix representing $G$, and then applies $k$-means to partition $V_G$ into $k$ clusters. Our first result is a tighter analysis of the performance of spectral clustering, which explains why it works under a much weaker condition than those studied in the literature. Our second result shows that, by using fewer than $k$ eigenvectors to construct the embedding, spectral clustering can produce better output for many practical instances; this result is the first of its kind in spectral clustering. Besides its conceptual and theoretical significance, the practical impact of our work is demonstrated by an empirical analysis on both synthetic and real-world datasets, in which spectral clustering produces comparable or better results with fewer than $k$ eigenvectors.
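As a rough illustration of the pipeline described above (embed the vertices with a few eigenvectors, then run $k$-means), the following is a minimal sketch rather than the method analysed in this work. It assumes the normalized Laplacian as the matrix of $G$, a connected graph with no isolated vertices, and a plain Lloyd's iteration with farthest-point initialisation. The names `spectral_embedding`, `kmeans`, and `num_vectors` are hypothetical and chosen only for this example.

```python
import numpy as np

def spectral_embedding(adj, num_vectors):
    """Embed vertices with the bottom eigenvectors of the normalized Laplacian.

    Assumption: adj is a symmetric adjacency matrix with no isolated vertices.
    """
    deg = adj.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    lap = np.eye(len(adj)) - d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(lap)  # eigenvalues are returned in ascending order
    return vecs[:, :num_vectors]

def kmeans(points, k, iters=50):
    """Plain Lloyd's algorithm with deterministic farthest-point initialisation."""
    centers = [points[0]]
    for _ in range(1, k):
        # Squared distance from every point to its nearest chosen center.
        d = np.min([((points - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(points[d.argmax()])
    centers = np.array(centers)
    for _ in range(iters):
        dists = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = points[labels == j].mean(axis=0)
    return labels

# Toy graph (an assumption for illustration): two 5-cliques joined by one edge.
n = 5
adj = np.zeros((2 * n, 2 * n))
adj[:n, :n] = 1.0
adj[n:, n:] = 1.0
np.fill_diagonal(adj, 0.0)
adj[n - 1, n] = adj[n, n - 1] = 1.0

k = 2
num_vectors = 2  # may be set below k; how to choose it is not covered here
labels = kmeans(spectral_embedding(adj, num_vectors), k)
print(labels)
```

Here `num_vectors` is kept separate from `k` so that the embedding dimension can be varied independently of the number of clusters, which is the setting the second result concerns. The sketch does not attempt to reproduce how this work selects the number of eigenvectors.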