In this paper, we introduce a new methodology for solving the orthogonal non-negative matrix factorization (ONMF) problem, in which the objective is to approximate an input data matrix by the product of two non-negative matrices, a features matrix and a mixing matrix, one of which is orthogonal. We show how ONMF can be interpreted as a specific facility-location problem (FLP), and adapt a maximum-entropy-principle-based solution for the FLP to the ONMF problem. The proposed approach guarantees orthogonality of the features (or mixing) matrix while ensuring that both matrix factors are non-negative. Moreover, the features (mixing) matrix has exactly one non-zero element in each row (column), yielding maximum sparsity of the orthogonal factor. This enables a semantic interpretation of the underlying data matrix in terms of non-overlapping features. Experiments on synthetic data and a standard microarray dataset demonstrate significant improvements in the sparsity and orthogonality scores of the features (mixing) matrices, while achieving approximately the same or better (up to 3%) reconstruction errors.
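To make the stated structure concrete, the following is a minimal sketch, not the paper's algorithm, of the ONMF form described above. It assumes a data matrix X approximated as F G, with hypothetical shapes and the symbols F (features matrix) and G (mixing matrix) chosen here for illustration; it shows how a features matrix with exactly one non-zero element per row has mutually orthogonal columns, and how a relative reconstruction error can be measured.

```python
import numpy as np

# Sketch of the ONMF structure (hypothetical shapes and names, not the paper's method):
# X (m x n) is approximated as F @ G, with F (m x k) and G (k x n) non-negative,
# and F having exactly one non-zero entry per row.

rng = np.random.default_rng(0)
m, n, k = 6, 5, 3

# Features matrix F with exactly one non-zero element in each row.
labels = rng.integers(0, k, size=m)          # assignment of each row to a single feature
F = np.zeros((m, k))
F[np.arange(m), labels] = rng.uniform(0.5, 1.5, size=m)

G = rng.uniform(0.0, 1.0, size=(k, n))       # non-negative mixing matrix
X = F @ G                                    # synthetic data consistent with the model

# Orthogonality: F^T F is diagonal because no two columns of F share a non-zero row.
gram = F.T @ F
print("off-diagonal mass of F^T F:", np.abs(gram - np.diag(np.diag(gram))).sum())

# Relative reconstruction error ||X - F G||_F / ||X||_F (zero here by construction).
err = np.linalg.norm(X - F @ G) / np.linalg.norm(X)
print("relative reconstruction error:", err)
```

Because each row of F touches only one column, the columns can be normalized to give an exactly orthogonal factor without disturbing non-negativity, which is the sparsity/orthogonality property the abstract refers to.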