This work considers low-rank canonical polyadic decomposition (CPD) under a class of non-Euclidean loss functions that frequently arise in statistical machine learning and signal processing. Such loss functions are often used for certain types of tensor data, e.g., count and binary tensors, for which the least squares loss is considered unnatural. Compared to the least squares loss, non-Euclidean losses are generally more challenging to handle. Non-Euclidean CPD has attracted considerable interest, and a number of prior works exist. However, pressing computational and theoretical challenges, such as scalability and convergence issues, still remain. This work offers a unified stochastic algorithmic framework for large-scale CPD under a variety of non-Euclidean loss functions. Our key contribution is a flexible stochastic mirror descent framework built upon a tensor fiber sampling strategy. Leveraging the sampling scheme and the multilinear algebraic structure of low-rank tensors, the proposed lightweight algorithm ensures global convergence to a stationary point under reasonable conditions. Numerical results show that our framework attains promising non-Euclidean CPD performance, while exhibiting substantial computational savings compared to state-of-the-art methods.
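To make the described idea concrete, the following is a minimal, illustrative sketch (not the authors' exact algorithm) of one fiber-sampled stochastic mirror descent step for a third-order nonnegative CPD under the generalized KL divergence, a typical non-Euclidean loss for count tensors. The helper names (`sampled_mode1_fibers`, `kl_mirror_step`), the step-size rule, and the choice of the negative-entropy mirror map are our own assumptions for illustration.

```python
import numpy as np

def sampled_mode1_fibers(X, cols):
    """Return the sampled columns of the mode-1 unfolding of X (I x J x K).
    Column index col = j + J*k corresponds to the mode-1 fiber X[:, j, k]."""
    I, J, K = X.shape
    j, k = cols % J, cols // J
    return X[:, j, k]                       # shape (I, len(cols))

def kl_mirror_step(A, H_s, X_s, step):
    """One mirror descent update of factor A with the negative-entropy mirror
    map (i.e., an exponentiated-gradient step), using only sampled fibers.
    Low-rank model for the sampled columns: M_s = A @ H_s.T (generalized KL loss)."""
    M_s = A @ H_s.T + 1e-12                 # avoid division by zero
    grad = (1.0 - X_s / M_s) @ H_s          # stochastic gradient w.r.t. A
    return A * np.exp(-step * grad)         # multiplicative update keeps A > 0

# Toy usage: update the mode-1 factor A; B and C are updated analogously
# on their own unfoldings. S fibers are sampled per iteration.
rng = np.random.default_rng(0)
I, J, K, R, S = 30, 40, 50, 5, 64
A, B, C = (rng.random((n, R)) for n in (I, J, K))
X = np.einsum('ir,jr,kr->ijk', A, B, C) + 0.1    # synthetic nonnegative tensor

for t in range(100):
    cols = rng.choice(J * K, size=S, replace=False)
    H_s = C[cols // J] * B[cols % J]        # sampled rows of the Khatri-Rao product C (.) B
    X_s = sampled_mode1_fibers(X, cols)
    A = kl_mirror_step(A, H_s, X_s, step=1e-3 / np.sqrt(t + 1))
```

The sketch reflects the computational appeal of fiber sampling mentioned above: each update touches only S fibers and S rows of the Khatri-Rao product, rather than the full unfolding, so the per-iteration cost is independent of the number of fibers in the tensor.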