Many problems in data science can be treated as estimating a low-rank matrix from highly incomplete, sometimes even corrupted, observations. One popular approach is to resort to matrix factorization, where the low-rank matrix factors are optimized via first-order methods over a smooth loss function, such as the residual sum of squares. While tremendous progress has been made in recent years, the natural smooth formulation suffers from two sources of ill-conditioning: the iteration complexity of gradient descent scales poorly with both the dimension and the condition number of the low-rank matrix. Moreover, the smooth formulation is not robust to corruptions. In this paper, we propose scaled subgradient methods to minimize a family of nonsmooth and nonconvex formulations -- in particular, the residual sum of absolute errors -- which are guaranteed to converge at a fast rate that is almost dimension-free and independent of the condition number, even in the presence of corruptions. We illustrate the effectiveness of our approach when the observation operator satisfies certain mixed-norm restricted isometry properties, and derive state-of-the-art performance guarantees for a variety of problems such as robust low-rank matrix sensing and quadratic sampling.
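To make the idea concrete, the following is a minimal NumPy sketch of a scaled subgradient iteration on the residual sum of absolute errors for robust low-rank matrix sensing. All problem sizes, the step-size schedule, the corruption model, and the initialization (a small perturbation of the truth standing in for a spectral initialization) are illustrative assumptions, not the paper's exact algorithm or parameters:

```python
import numpy as np

rng = np.random.default_rng(0)
n, r, m = 30, 2, 600          # illustrative sizes: n x n rank-r matrix, m measurements

# Ground-truth low-rank matrix and a Gaussian measurement operator
L_star = rng.standard_normal((n, r))
R_star = rng.standard_normal((n, r))
M_star = L_star @ R_star.T
A = rng.standard_normal((m, n, n)) / np.sqrt(m)
y = np.einsum('kij,ij->k', A, M_star)

# Grossly corrupt a small fraction of the measurements
bad = rng.choice(m, size=m // 20, replace=False)
y[bad] += 10.0 * rng.standard_normal(bad.size)

# Stand-in for a spectral initialization: a small perturbation of the truth
L = L_star + 0.1 * rng.standard_normal((n, r))
R = R_star + 0.1 * rng.standard_normal((n, r))
err0 = np.linalg.norm(L @ R.T - M_star) / np.linalg.norm(M_star)

eta, q = 0.05, 0.97           # geometrically decaying step sizes (illustrative values)
for t in range(300):
    res = np.einsum('kij,ij->k', A, L @ R.T) - y
    G = np.einsum('k,kij->ij', np.sign(res), A)   # subgradient of the l1 loss
    # Scaled (preconditioned) subgradient steps: the (R^T R)^{-1} and (L^T L)^{-1}
    # factors adapt the step to the local geometry of the factors
    step = eta * q ** t
    L_new = L - step * G @ R @ np.linalg.inv(R.T @ R)
    R = R - step * G.T @ L @ np.linalg.inv(L.T @ L)
    L = L_new

err = np.linalg.norm(L @ R.T - M_star) / np.linalg.norm(M_star)
```

Because the ℓ1 loss keeps the subgradient norm from vanishing near a solution, a geometrically decaying step size is used in place of a gradient-proportional one; the preconditioning by the Gram-matrix inverses is what makes the iteration insensitive to the conditioning of the factors.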