Non-negative matrix and tensor factorisations are a classical tool for finding low-dimensional representations of high-dimensional datasets. In applications such as imaging, datasets can be regarded as distributions supported on a space with metric structure. In such a setting, a loss function based on the Wasserstein distance of optimal transportation theory is a natural choice since it incorporates the underlying geometry of the data. We introduce a general mathematical framework for computing non-negative factorisations of both matrices and tensors with respect to an optimal transport loss. We derive an efficient computational method for solving the resulting problem using a convex dual formulation, and demonstrate the applicability of this approach through several numerical illustrations on both matrix- and tensor-valued data.
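To make concrete the sense in which a transport-based loss incorporates the geometry of the data, the following minimal sketch compares a Euclidean distance with an entropy-regularised optimal transport cost on one-dimensional histograms. It is an illustration only and not the method of this work: it uses plain Sinkhorn iterations rather than the convex dual formulation mentioned above, and the grid, the regularisation strength `eps`, and the helper names `sinkhorn_cost` and `bump` are all assumptions chosen for the example.

```python
import numpy as np

def sinkhorn_cost(a, b, C, eps=0.05, n_iter=500):
    """Transport cost <P, C> of the entropy-regularised plan between histograms a and b.

    Illustrative Sinkhorn iteration; eps and n_iter are assumed values.
    """
    K = np.exp(-C / eps)                 # Gibbs kernel built from the ground cost
    u = np.ones_like(a)
    for _ in range(n_iter):
        v = b / (K.T @ u)                # rescale to match the column marginal b
        u = a / (K @ v)                  # rescale to match the row marginal a
    P = u[:, None] * K * v[None, :]      # regularised transport plan
    return float(np.sum(P * C))

# Hypothetical setting: 50 grid points on [0, 1] with squared-distance ground cost.
x = np.linspace(0.0, 1.0, 50)
C = (x[:, None] - x[None, :]) ** 2

def bump(center, width=0.05):
    """Normalised Gaussian-shaped histogram centred at `center`."""
    h = np.exp(-0.5 * ((x - center) / width) ** 2)
    return h / h.sum()

a, near, far = bump(0.2), bump(0.5), bump(0.8)

# Euclidean distance: the bumps barely overlap, so both targets look equally far from a.
l2_near = np.linalg.norm(a - near)
l2_far = np.linalg.norm(a - far)

# Transport cost: grows with how far the mass has to be moved on the grid.
ot_near = sinkhorn_cost(a, near, C)
ot_far = sinkhorn_cost(a, far, C)

print("Euclidean:", l2_near, l2_far)
print("Transport:", ot_near, ot_far)
```

In this toy setting the Euclidean distance hardly distinguishes the two targets, while the transport cost is clearly larger for the histogram whose mass lies further away, which is the behaviour that motivates an optimal transport loss for data supported on a metric space.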