通用杠杆分数:几何解释和应用 (Generalized Leverage Scores: Geometric Interpretation and Applications)

In problems involving matrix computations, the concept of leverage has found a large number of applications. In particular, leverage scores, which relate the columns of a matrix to the subspaces spanned by its leading singular vectors, are helpful in revealing column subsets to approximately factorize a matrix with quality guarantees. As such, they provide a solid foundation for a variety of machine-learning methods. In this paper we extend the definition of leverage scores to relate the columns of a matrix to arbitrary subsets of singular vectors. We establish a precise connection between column and singular-vector subsets, by relating the concepts of leverage scores and principal angles between subspaces. We employ this result to design approximation algorithms with provable guarantees for two well-known problems: generalized column subset selection and sparse canonical correlation analysis. We run numerical experiments to provide further insight on the proposed methods. The novel bounds we derive improve our understanding of fundamental concepts in matrix approximations. In addition, our insights may serve as building blocks for further contributions.

翻译：在涉及矩阵计算的问题中,杠杆概念发现了大量应用,特别是将矩阵的柱子与以其领先的单向矢量覆盖的子空间的子空间联系起来的杠杆分数,有助于揭示列子子集,以大致地将具有质量保证的矩阵乘以。因此,它们为各种机器学习方法提供了坚实的基础。在本文件中,我们扩大了杠杆分数的定义,将矩阵的柱子与单矢量的任意子集联系起来。我们通过将杠杆分数概念和子空间之间主要角度联系起来,在柱子和单向子子中建立精确的联系。我们利用这一结果设计近似算法,为两个众所周知的问题提供可行的保证:普通的子集选用和稀有的罐子关联分析。我们进行了数字实验,以进一步了解拟议方法。我们从新的界限中增进了我们对矩阵近似值基本概念的理解。此外,我们的洞察力可以作为进一步贡献的基础。