This paper studies low-rank matrix completion in the presence of heavy-tailed and possibly asymmetric noise, where the goal is to estimate an underlying low-rank matrix from a highly incomplete set of noisy entries. Although matrix completion has attracted much attention over the past decade, theoretical understanding is still lacking when the observations are contaminated by heavy-tailed noise. Prior theory falls short of explaining the empirical results and fails to capture the optimal dependence of the estimation error on the noise level. In this paper, we adopt an adaptive Huber loss to accommodate heavy-tailed noise; the loss is robust against large and possibly asymmetric errors when its parameter is carefully chosen to balance the Huberization bias against robustness to outliers. We then propose an efficient nonconvex algorithm based on a balanced low-rank Burer-Monteiro matrix factorization and gradient descent with robust spectral initialization. We prove that, under merely a bounded second-moment condition on the error distribution rather than a sub-Gaussian assumption, the Euclidean error of the iterates generated by the proposed algorithm decreases geometrically fast until it reaches a minimax-optimal statistical estimation error of the same order as in the sub-Gaussian case. The key technique behind this advance is a leave-one-out analysis framework. The theoretical results are corroborated by our simulation studies.
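To make the ingredients named in the abstract concrete, the following is a minimal NumPy sketch, not the authors' implementation. It combines a Huber-type loss on the observed entries, a balanced Burer-Monteiro factorization, gradient descent, and a truncation-based spectral initialization. The function names (`huber_grad`, `robust_spectral_init`, `huber_mc`, `demo`), the separate initialization threshold `tau_init`, the step-size rule `eta / sigma_1`, the balancing weight 1/8, and all numerical values in `demo` are assumptions chosen for illustration; the paper's adaptive choice of the Huber parameter is not reproduced here.

```python
import numpy as np


def huber_grad(r, tau):
    """Derivative of the Huber loss: the residual clipped to [-tau, tau]."""
    return np.clip(r, -tau, tau)


def robust_spectral_init(Y, mask, rank, tau_init, p):
    """Truncate observed entries at tau_init, rescale by 1/p, take a rank-r SVD.

    Returns balanced factors X0, Z0 and the top singular value.
    """
    Y_trunc = np.clip(Y, -tau_init, tau_init) * mask / p
    U, s, Vt = np.linalg.svd(Y_trunc, full_matrices=False)
    root = np.sqrt(s[:rank])
    return U[:, :rank] * root, Vt[:rank].T * root, s[0]


def huber_mc(Y, mask, rank, tau, tau_init, eta=0.2, n_iter=500):
    """Gradient descent on a Huberized, balanced factorized objective (sketch)."""
    p = mask.mean()  # empirical sampling rate
    X, Z, sigma_1 = robust_spectral_init(Y, mask, rank, tau_init, p)
    step = eta / sigma_1  # assumed step-size rule, scaled by the top singular value
    for _ in range(n_iter):
        # Huberized residual, restricted to observed entries
        R = mask * huber_grad(X @ Z.T - Y, tau)
        # Gradient of the balancing term (1/8) * ||X^T X - Z^T Z||_F^2
        B = X.T @ X - Z.T @ Z
        grad_X = R @ Z / p + 0.5 * X @ B
        grad_Z = R.T @ X / p - 0.5 * Z @ B
        X, Z = X - step * grad_X, Z - step * grad_Z
    return X @ Z.T


def demo(seed=0, n=100, rank=2, p=0.6, noise_scale=0.3):
    """Synthetic check with Student-t (df=3) noise; returns the relative error."""
    rng = np.random.default_rng(seed)
    M = rng.standard_normal((n, rank)) @ rng.standard_normal((n, rank)).T
    mask = rng.random((n, n)) < p  # entries observed independently with rate p
    noise = noise_scale * rng.standard_t(df=3, size=(n, n))  # finite variance, heavy tails
    Y = (M + noise) * mask
    M_hat = huber_mc(Y, mask, rank, tau=2.0, tau_init=8.0)
    return np.linalg.norm(M_hat - M) / np.linalg.norm(M)


if __name__ == "__main__":
    print("relative Frobenius error:", demo())
```

Calling `demo()` generates a rank-2 matrix, observes roughly 60% of its entries under Student-t noise with three degrees of freedom, and reports the relative Frobenius error of the recovered matrix. Clipping the residual in `huber_grad` bounds the influence of any single corrupted entry on the gradient, which is the robustness mechanism the abstract attributes to the Huber loss.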