Low-rank tensor approximation is a fundamental tool in modern machine learning and data science. In this paper, we study the characterization, perturbation analysis, and an efficient sampling strategy for two primary tensor CUR approximations, namely Chidori and Fiber CUR. We characterize exact tensor CUR decompositions for tensors of low multilinear rank. We also present theoretical error bounds for the tensor CUR approximations when (adversarial or Gaussian) noise is present. Moreover, we show that low-cost uniform sampling is sufficient for tensor CUR approximations if the tensor has an incoherent structure. Empirical performance evaluations, on both synthetic and real-world datasets, establish the speed advantage of the tensor CUR approximations over other state-of-the-art low multilinear rank tensor approximations.
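To make the sampling idea concrete, the following is a minimal NumPy sketch of a Chidori-style tensor CUR approximation under uniform sampling. It is an illustrative toy, not the paper's implementation: the helper names (`unfold`, `chidori_cur`), the mode-3 restriction, and the sample sizes are assumptions for the example. Each mode's index set is drawn uniformly at random; the core subtensor `R` and the mode-wise fiber matrices `C_k` are then combined through pseudoinverses of the core's unfoldings.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic tensor of low multilinear rank (2, 2, 2), assumed for illustration.
r = (2, 2, 2)
n = (30, 30, 30)
core = rng.standard_normal(r)
factor_mats = [rng.standard_normal((n[k], r[k])) for k in range(3)]
T = np.einsum('abc,ia,jb,kc->ijk', core, *factor_mats)

def unfold(X, mode):
    """Mode-k unfolding: move mode k to the front and flatten the rest."""
    return np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)

def chidori_cur(T, sizes, rng):
    """Toy Chidori-style CUR: uniformly sample an index set per mode,
    keep the core subtensor R and the mode-k fiber blocks C_k, and
    reconstruct as R x_k (C_k pinv(R_(k)))."""
    I = [np.sort(rng.choice(nk, s, replace=False))
         for nk, s in zip(T.shape, sizes)]
    R = T[np.ix_(*I)]  # core subtensor indexed by the sampled sets
    factors = []
    for k in range(3):
        # Restrict every mode except k to its sampled index set.
        idx = [I[j] if j != k else np.arange(T.shape[j]) for j in range(3)]
        Ck = unfold(T[np.ix_(*idx)], k)              # mode-k fibers
        factors.append(Ck @ np.linalg.pinv(unfold(R, k)))
    return np.einsum('abc,ia,jb,kc->ijk', R, *factors)

approx = chidori_cur(T, sizes=(4, 4, 4), rng=rng)
err = np.linalg.norm(approx - T) / np.linalg.norm(T)
```

Because the tensor here is noiseless and the uniformly sampled core generically inherits the full multilinear rank, the reconstruction is exact up to floating-point error; with noise, the error bounds discussed in the paper govern the approximation quality instead.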