This paper provides theoretical support for the clustering aspect of nonnegative matrix factorization (NMF). Using the Karush-Kuhn-Tucker optimality conditions, we show that the NMF objective is equivalent to a graph clustering objective, which gives the clustering aspect of NMF a solid justification. Unlike previous approaches, which usually discard the nonnegativity constraints, our approach guarantees that the stationary point used in deriving the equivalence lies in the feasible region within the nonnegative orthant. Additionally, since the clustering capability of a matrix decomposition technique can sometimes imply its latent semantic indexing (LSI) aspect, we also evaluate the LSI aspect of NMF by showing its capability to handle the synonymy and polysemy problems on synthetic datasets. A more extensive evaluation then compares the LSI performance of NMF with that of the singular value decomposition (SVD), the standard LSI method, on standard datasets.
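As a rough illustration of the clustering aspect described above, the following minimal sketch factorizes a small term-by-document matrix and reads cluster labels off the factors. It is not the derivation or the algorithm of this paper: the solver is the well-known Lee-Seung multiplicative update rule for the Frobenius objective, chosen here only as an assumption, and the function name `nmf_multiplicative`, the toy matrix `A`, and all parameter values are hypothetical.

```python
import numpy as np

def nmf_multiplicative(A, k, n_iter=500, seed=0, eps=1e-9):
    """Approximate a nonnegative matrix A (m x n) as B @ C with B, C >= 0.

    Uses Lee-Seung multiplicative updates for ||A - B C||_F^2 (an assumed
    solver, not necessarily the one analyzed in the paper).
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape
    B = rng.random((m, k)) + eps  # strictly positive start keeps iterates feasible
    C = rng.random((k, n)) + eps
    for _ in range(n_iter):
        # Each update multiplies by a nonnegative ratio, so B and C stay >= 0.
        C *= (B.T @ A) / (B.T @ B @ C + eps)
        B *= (A @ C.T) / (B @ C @ C.T + eps)
    return B, C

# Hypothetical toy data: rows are terms, columns are documents.
# Documents 0-2 share terms 0-1; documents 3-5 share terms 2-3.
A = np.array([
    [2, 3, 1, 0, 0, 0],
    [1, 2, 2, 0, 0, 0],
    [0, 0, 0, 3, 1, 2],
    [0, 0, 0, 1, 2, 3],
], dtype=float)

B, C = nmf_multiplicative(A, k=2)

# Assign each document (column) and term (row) to its dominant factor.
doc_labels = C.argmax(axis=0)
term_labels = B.argmax(axis=1)
print("document clusters:", doc_labels)
print("term clusters:", term_labels)
```

Because the updates only rescale entries by nonnegative ratios, the iterates never leave the nonnegative orthant, which is the feasibility property the abstract emphasizes. The label indices themselves are arbitrary, since the order of the factors depends on the random initialization.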