Symmetric nonnegative matrix factorization (SNMF) has been demonstrated to be a powerful method for data clustering. However, SNMF is mathematically formulated as a non-convex optimization problem and is therefore sensitive to the initialization of its variables. Inspired by ensemble clustering, which seeks a better clustering result from a set of clustering results, we propose self-supervised SNMF (S$^3$NMF), which progressively boosts clustering performance by taking advantage of SNMF's sensitivity to initialization, without relying on any additional information. Specifically, we first perform SNMF repeatedly, each time initialized with a random nonnegative matrix, yielding multiple decomposed matrices. Then, we rank the quality of the resulting matrices with adaptively learned weights, from which a new similarity matrix that is expected to be more discriminative is reconstructed and fed to SNMF again. These two steps are iterated until the stopping criterion is met or the maximum number of iterations is reached. We mathematically formulate S$^3$NMF as a constrained optimization problem and provide an alternating optimization algorithm to solve it with a theoretical convergence guarantee. Extensive experimental results on $10$ commonly used benchmark datasets demonstrate the significant advantage of our S$^3$NMF over $12$ state-of-the-art methods in terms of $5$ quantitative metrics. The source code is publicly available at https://github.com/jyh-learning/SSSNMF.
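For intuition, the following is a minimal NumPy sketch of the loop described above. The inner solver uses a standard damped multiplicative update for $\min_{H \geq 0} \|A - HH^\top\|_F^2$; the softmax-style inverse-residual weighting is only an illustrative stand-in for the paper's adaptively learned weights, and the function names (`snmf`, `s3nmf_sketch`) and hyperparameters are assumptions, not the authors' implementation.

```python
import numpy as np

def snmf(A, k, n_iters=200, beta=0.5, eps=1e-10, rng=None):
    """One SNMF run: min_{H >= 0} ||A - H H^T||_F^2 via a damped
    multiplicative update, starting from a random nonnegative H."""
    rng = np.random.default_rng(rng)
    H = rng.random((A.shape[0], k))          # random nonnegative initialization
    for _ in range(n_iters):
        AH = A @ H
        HHtH = H @ (H.T @ H)
        # Damped update keeps H nonnegative; eps guards against division by zero.
        H *= (1.0 - beta) + beta * AH / np.maximum(HHtH, eps)
    return H

def s3nmf_sketch(A, k, n_runs=10, n_rounds=5, gamma=1.0, seed=0):
    """Illustrative self-supervised loop: repeat SNMF with random inits,
    weight the results by fit quality (a stand-in for the paper's learned
    weights), and rebuild a hopefully more discriminative similarity matrix."""
    rng = np.random.default_rng(seed)
    for _ in range(n_rounds):
        Hs = [snmf(A, k, rng=rng) for _ in range(n_runs)]
        residuals = np.array([np.linalg.norm(A - H @ H.T) for H in Hs])
        w = np.exp(-gamma * residuals)
        w /= w.sum()                         # better-fitting runs count more
        A = sum(wt * (H @ H.T) for wt, H in zip(w, Hs))  # new similarity matrix
    return Hs[int(np.argmin(residuals))]     # best factor of the final round
```

Given the returned factor `H`, cluster labels can be read off the dominant column of each row, e.g. `labels = H.argmax(axis=1)`.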