Low-rank tensor approximation is a fundamental tool in modern machine learning and data science. In this paper, we study the characterization, perturbation analysis, and an efficient sampling strategy for two primary tensor CUR approximations, namely Chidori and Fiber CUR. We characterize exact tensor CUR decompositions for low multilinear rank tensors. We also present theoretical error bounds for the tensor CUR approximations in the presence of (adversarial or Gaussian) noise. Moreover, we show that low-cost uniform sampling is sufficient for tensor CUR approximations if the tensor has an incoherent structure. Empirical performance evaluations on both synthetic and real-world datasets establish the advantage of the tensor CUR approximations over other state-of-the-art low multilinear rank tensor approximations.
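To make the Fiber CUR idea with uniform sampling concrete, the following Python sketch (not taken from the paper; the names fiber_cur, mode_unfold, mode_multiply, and the oversampling factor are illustrative choices) uniformly samples index sets in each mode, extracts the corresponding fibers and core subtensor, and assembles an approximation of the form R x_i (C_i U_i^+). It implicitly assumes the sampled intersection matrices U_i have full rank, which the incoherence assumption is meant to ensure with high probability.

```python
import numpy as np

def mode_unfold(T, mode):
    """Mode-n unfolding: columns are the mode-n fibers of T."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def mode_multiply(T, M, mode):
    """Mode-n product of tensor T with matrix M."""
    Tm = np.moveaxis(T, mode, 0)
    out = M @ Tm.reshape(Tm.shape[0], -1)
    return np.moveaxis(out.reshape((M.shape[0],) + Tm.shape[1:]), 0, mode)

def fiber_cur(T, ranks, oversample=2, rng=None):
    """Illustrative Fiber CUR approximation via uniform sampling.

    ranks: assumed multilinear rank (r_1, ..., r_k).
    Per mode i, samples ~oversample*r_i row indices I_i and
    ~oversample*r_i fibers J_i of the mode-i unfolding, uniformly.
    """
    rng = np.random.default_rng(rng)
    k = T.ndim
    I = [rng.choice(T.shape[i],
                    size=min(oversample * ranks[i], T.shape[i]),
                    replace=False) for i in range(k)]
    factors = []
    for i in range(k):
        A_i = mode_unfold(T, i)
        J_i = rng.choice(A_i.shape[1],
                         size=min(oversample * ranks[i], A_i.shape[1]),
                         replace=False)
        C_i = A_i[:, J_i]          # uniformly sampled mode-i fibers
        U_i = C_i[I[i], :]         # intersection submatrix
        factors.append(C_i @ np.linalg.pinv(U_i))
    R = T[np.ix_(*I)]              # core subtensor indexed by I_1 x ... x I_k
    approx = R
    for i in range(k):
        approx = mode_multiply(approx, factors[i], i)
    return approx

# quick sanity check on a synthetic multilinear rank-(2,2,2) tensor
G = np.random.default_rng(0).standard_normal((2, 2, 2))
T = G
for i, n in enumerate((30, 40, 50)):
    T = mode_multiply(T, np.random.default_rng(i + 1).standard_normal((n, 2)), i)
T_hat = fiber_cur(T, ranks=(2, 2, 2), rng=0)
print(np.linalg.norm(T - T_hat) / np.linalg.norm(T))  # ~0 for exact low-rank input
```

In this sketch the only full-size computations are the index selections and the small pseudoinverses, which is the source of the low cost of uniform sampling highlighted above; the Chidori variant would instead restrict each unfolding's columns to the indices chosen for the other modes rather than sampling fibers independently.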