Non-negative matrix and tensor factorisations are a classical tool in machine learning and data science for finding low-dimensional representations of high-dimensional datasets. In applications such as imaging, datasets can often be regarded as distributions in a space with metric structure. In such a setting, a Wasserstein loss function based on optimal transportation theory is a natural choice since it incorporates knowledge about the geometry of the underlying space. We introduce a general mathematical framework for computing non-negative factorisations of matrices and tensors with respect to an optimal transport loss, and derive an efficient method for its solution using a convex dual formulation. We demonstrate the applicability of this approach with several numerical examples.
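For orientation, the following is a minimal sketch of the kind of objective such a framework targets; the notation (data matrix $X$ with columns $x_j$, non-negative dictionary $D$, non-negative weights $\lambda_j$, and Wasserstein loss $W$) is assumed for illustration and need not match the paper's own formulation:

$$
\min_{D \ge 0,\ \lambda_j \ge 0} \; \sum_{j} W\!\left(x_j,\, D\lambda_j\right),
$$

where each column $x_j$ is treated as a distribution on the underlying metric space and $W(\cdot,\cdot)$ is the optimal transport distance between distributions; per the abstract, the efficient solution method rests on a convex dual formulation of this transport loss.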