In this paper, we introduce a new methodology for solving the orthogonal nonnegative matrix factorization (ONMF) problem, in which the objective is to approximate an input data matrix by the product of two nonnegative matrices, the features matrix and the mixing matrix, one of which is orthogonal. We show how ONMF can be interpreted as a specific facility-location problem (FLP), and we adapt a maximum-entropy-principle-based solution for the FLP to the ONMF problem. The proposed approach guarantees orthogonality and sparsity of either the features matrix or the mixing matrix, while ensuring nonnegativity of both. Additionally, our methodology provides a quantitative characterization of the "true" number of underlying features, a hyperparameter required for ONMF. An evaluation of the proposed method on synthetic datasets, as well as on a standard genetic microarray dataset, indicates significantly better sparsity, orthogonality, and computational speed than similar methods in the literature, with comparable or improved reconstruction errors.
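To make the link between ONMF and an assignment-type problem such as the FLP concrete, the following is a minimal NumPy sketch; it is not the maximum-entropy method proposed in the paper. It relies on a standard observation: a nonnegative matrix with orthonormal rows can have at most one nonzero entry per column, so an orthogonal mixing matrix assigns each data column to exactly one feature. The function name `onmf_from_labels`, the symbols `X`, `W`, `H`, and `labels`, and the toy data are illustrative assumptions, and the assignment is taken as given rather than computed.

```python
import numpy as np


def onmf_from_labels(X, labels, k):
    """Build X ~ W @ H from a given hard assignment (illustrative helper).

    X      : (d, n) nonnegative data matrix, one data point per column.
    labels : (n,) integer array; labels[j] in {0, ..., k-1} is the feature
             assigned to column j (assumed known here, every feature used).
    Returns W (d, k) >= 0 and H (k, n) >= 0 with H @ H.T = I.
    """
    n = X.shape[1]
    H = np.zeros((k, n))
    # One nonzero per column: column j "belongs" to feature labels[j].
    H[labels, np.arange(n)] = 1.0
    # Scale each row to unit norm so that the rows are orthonormal.
    counts = H.sum(axis=1, keepdims=True)
    H /= np.sqrt(np.maximum(counts, 1.0))
    # For fixed H with orthonormal rows, W = X H^T minimizes ||X - W H||_F,
    # and it is nonnegative because X and H are nonnegative.
    W = X @ H.T
    return W, H


# Toy data (assumed): five columns drawn near two nonnegative prototypes.
rng = np.random.default_rng(0)
prototypes = np.array([[1.0, 0.0],
                       [1.0, 0.0],
                       [0.0, 1.0],
                       [0.0, 1.0]])
labels = np.array([0, 0, 0, 1, 1])
X = prototypes[:, labels] + 0.01 * rng.random((4, 5))

W, H = onmf_from_labels(X, labels, k=2)
rel_err = np.linalg.norm(X - W @ H) / np.linalg.norm(X)
nonzeros_per_column = np.count_nonzero(H, axis=0)
```

In this sketch the hard part of ONMF, choosing the assignment and the number of features `k`, is sidestepped by supplying `labels` directly; the abstract's FLP interpretation concerns precisely that choice.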