This work considers the problem of computing the CANDECOMP/PARAFAC (CP) decomposition of large tensors. One popular way is to translate the problem into a sequence of overdetermined least squares subproblems with Khatri-Rao product (KRP) structure. In this work, for tensor with different levels of importance in each fiber, combining stochastic optimization with randomized sampling, we present a mini-batch stochastic gradient descent algorithm with importance sampling for those special least squares subproblems. Four different sampling strategies are provided. They can avoid forming the full KRP or corresponding probabilities and sample the desired fibers from the original tensor directly. Moreover, a more practical algorithm with adaptive step size is also given. For the proposed algorithms, we present their convergence properties and numerical performance. The results on synthetic data show that our algorithms outperform the existing algorithms in terms of accuracy or the number of iterations.
翻译:这项工作考虑了计算大型电压分解的问题。 一种流行的方法是将问题转化成一个由Khatri- Rao产品结构构成的、 定型最小平方子问题序列。 在这项工作中, 对于每根纤维具有不同重要性、 将随机优化与随机抽样相结合的数组, 我们提出了一个小型的随机梯度梯度下行算法, 对这些特殊最小方次问题进行重要取样。 提供了四种不同的取样策略。 它们可以避免形成完整的 KRP 或相应的概率, 直接从原始的电压中抽取所需的纤维。 此外, 还给出了一个更实用的、 适应性步骤大小的算法。 对于提议的算法, 我们展示了它们的趋同性与数字性能。 合成数据的结果显示, 我们的算法在准确性或迭代数方面超过了现有的算法。