Many state-of-the-art subspace clustering methods follow a two-step process by first constructing an affinity matrix between data points and then applying spectral clustering to this affinity. Most of the research into these methods focuses on the first step of generating the affinity matrix, which often exploits the self-expressive property of linear subspaces, with little consideration typically given to the spectral clustering step that produces the final clustering. Moreover, existing methods obtain the affinity by applying ad-hoc postprocessing steps to the self-expressive representation of the data, and this postprocessing can have a significant impact on the subsequent spectral clustering step. In this work, we propose to unify these two steps by jointly learning both a self-expressive representation of the data and an affinity matrix that is well-normalized for spectral clustering. In the proposed model, we constrain the affinity matrix to be doubly stochastic, which results in a principled method for affinity matrix normalization while also exploiting the known benefits of doubly stochastic normalization in spectral clustering. While our proposed model is non-convex, we give a convex relaxation that is provably equivalent in many regimes; we also develop an efficient approximation to the full model that works well in practice. Experiments show that our method achieves state-of-the-art subspace clustering performance on many common datasets in computer vision.
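As a concrete illustration of the two-step pipeline described above, the sketch below builds a ridge-regularized self-expressive representation, turns it into a symmetric affinity, normalizes that affinity to be (approximately) doubly stochastic with Sinkhorn-style row/column scaling, and then runs spectral clustering on the result. This is a minimal toy example, not the proposed joint model: the helper names (`self_expressive_coefficients`, `sinkhorn_normalize`, `subspace_cluster`), the ridge regularizer, and the post-hoc normalization order are assumptions introduced here for illustration only.

```python
# Minimal sketch of a self-expressive subspace clustering pipeline with a
# doubly stochastic affinity normalization step (illustrative, not the
# authors' joint model).
import numpy as np
from sklearn.cluster import SpectralClustering


def self_expressive_coefficients(X, reg=1e-2):
    """Ridge-regularized self-expression: each point is written as a linear
    combination of the other points. Zeroing the diagonal afterwards is a
    simplification of the usual zero-diagonal constraint."""
    # X: (d, n) data matrix with points as columns.
    n = X.shape[1]
    G = X.T @ X
    # Closed form for min ||X - XC||_F^2 + reg * ||C||_F^2.
    C = np.linalg.solve(G + reg * np.eye(n), G)
    np.fill_diagonal(C, 0.0)  # discard trivial self-representation
    return C


def sinkhorn_normalize(A, n_iters=100):
    """Alternate row and column scaling so that A becomes approximately
    doubly stochastic (nonnegative, rows and columns summing to 1)."""
    A = np.abs(A)
    for _ in range(n_iters):
        A = A / A.sum(axis=1, keepdims=True)  # row normalization
        A = A / A.sum(axis=0, keepdims=True)  # column normalization
    return 0.5 * (A + A.T)  # re-symmetrize for spectral clustering


def subspace_cluster(X, n_clusters, reg=1e-2):
    C = self_expressive_coefficients(X, reg)
    A = 0.5 * (np.abs(C) + np.abs(C).T)  # symmetric affinity from |C|
    A = sinkhorn_normalize(A)            # doubly stochastic affinity
    model = SpectralClustering(n_clusters=n_clusters, affinity="precomputed")
    return model.fit_predict(A)


# Example usage on synthetic data drawn from two random 2-D subspaces in R^10.
rng = np.random.default_rng(0)
U1, U2 = rng.standard_normal((10, 2)), rng.standard_normal((10, 2))
X = np.hstack([U1 @ rng.standard_normal((2, 50)),
               U2 @ rng.standard_normal((2, 50))])
labels = subspace_cluster(X, n_clusters=2)
```

In this sketch the self-expressive coefficients are computed first and the doubly stochastic normalization is applied as a separate post-processing step; the point of the proposed model is instead to learn the self-expressive representation and the doubly stochastic affinity jointly, so that the affinity fed to spectral clustering is well-normalized by construction rather than by ad-hoc post-processing.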