In this note, we investigate how well we can reconstruct the best rank-$r$ approximation of a large matrix from a small number of its entries. We show that even if a data matrix is of full rank and cannot be approximated well by a low-rank matrix, its best low-rank approximations may still be reliably computed or estimated from a small number of its entries. This is especially relevant from a statistical viewpoint: the best low-rank approximations to a data matrix are often of more interest than the matrix itself because they capture the more stable, and often more reproducible, properties of an otherwise complicated data-generating model. In particular, we investigate two agnostic approaches: the first is based on spectral truncation, and the second is an optimization procedure based on projected gradient descent. We argue that, while the first approach is intuitive and reasonably effective, the second has far superior performance in general. We show that the reconstruction error depends on how closely the matrix can be approximated by one of low rank. We present both theoretical and numerical evidence demonstrating the effectiveness of the proposed approaches.
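To make the two approaches concrete, here is a minimal NumPy sketch of both, under illustrative assumptions that are not taken from the paper: the observed entries are sampled uniformly at random, the zero-filled observations are rescaled by the inverse sampling rate $1/p$, and the step size `eta` and iteration count are arbitrary choices rather than the tuned values a full analysis would prescribe.

```python
import numpy as np

def svd_truncate(A, r):
    """Best rank-r approximation of A via truncated SVD."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return (U[:, :r] * s[:r]) @ Vt[:r]

def spectral_estimate(M_obs, mask, r):
    """Spectral truncation: rescale the zero-filled observed entries
    by the inverse sampling rate, then truncate to rank r."""
    p = mask.mean()                        # fraction of observed entries
    return svd_truncate(M_obs * mask / p, r)

def pgd_estimate(M_obs, mask, r, eta=1.0, n_iter=200):
    """Projected gradient descent: take a gradient step on the squared
    loss over observed entries, then project onto rank-r matrices."""
    p = mask.mean()
    X = svd_truncate(M_obs * mask / p, r)  # spectral initialization
    for _ in range(n_iter):
        grad = mask * (X - M_obs)          # gradient of 0.5*||P_Omega(X - M)||_F^2
        X = svd_truncate(X - (eta / p) * grad, r)
    return X
```

In this sketch, `M_obs` holds the observed entries (zeros elsewhere) and `mask` is the 0/1 observation pattern; the spectral estimate doubles as the initializer for the iterative procedure, which repeatedly re-projects onto the set of rank-$r$ matrices via truncated SVD.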