Low-rank and nonsmooth matrix optimization problems capture many fundamental tasks in statistics and machine learning. While significant progress has been made in recent years in developing efficient methods for \textit{smooth} low-rank optimization problems that avoid maintaining high-rank matrices and computing expensive high-rank SVDs, advances for nonsmooth problems have been slow-paced. In this paper we consider standard convex relaxations for such problems. Mainly, we prove that under a natural \textit{generalized strict complementarity} condition and under the relatively mild assumption that the nonsmooth objective can be written as a maximum of smooth functions, the \textit{extragradient method}, when initialized with a "warm-start" point, converges to an optimal solution with rate $O(1/t)$ while requiring only two \textit{low-rank} SVDs per iteration. We give a precise trade-off between the rank of the SVDs required and the radius of the ball in which we need to initialize the method. We support our theoretical results with empirical experiments on several nonsmooth low-rank matrix recovery tasks, demonstrating that, with simple initializations, the extragradient method produces exactly the same iterates when full-rank SVDs are replaced with SVDs whose rank matches the rank of the (low-rank) ground-truth matrix to be recovered.
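For concreteness, one iteration of the extragradient scheme referred to above has the standard two-projection form (the symbols $f$, $\mathcal{K}$, $\eta$, and $\Pi_{\mathcal{K}}$ are generic placeholders used only for illustration here, not necessarily the notation adopted in the body of the paper):
\[
Y_t = \Pi_{\mathcal{K}}\!\left[X_t - \eta \nabla f(X_t)\right], \qquad X_{t+1} = \Pi_{\mathcal{K}}\!\left[X_t - \eta \nabla f(Y_t)\right],
\]
where $\Pi_{\mathcal{K}}$ denotes the Euclidean projection onto the feasible set $\mathcal{K}$. When $\mathcal{K}$ is, for instance, a nuclear-norm ball or a spectrahedron, each projection is computed from an SVD of the point being projected; these two projections per iteration are the SVD computations that, under the warm-start and generalized strict complementarity conditions discussed above, can be carried out with low-rank rather than full-rank SVDs.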