This paper considers rank-constrained optimization problems that minimize a general objective function $f(X)$ over the set of $n\times m$ matrices of rank at most $r$. To handle the rank constraint and reduce the computational burden, we factorize $X$ as $UV^T$, where $U$ and $V$ are $n\times r$ and $m\times r$ matrices, respectively, and optimize over the small matrices $U$ and $V$. We characterize the global optimization geometry of the resulting nonconvex factored problem and show that its objective function satisfies the robust strict saddle property whenever the original objective function $f$ satisfies restricted strong convexity and smoothness properties. This ensures that many local search algorithms (such as noisy gradient descent) converge globally in polynomial time when solving the factored problem. We also provide a comprehensive analysis of the optimization geometry of a matrix factorization problem, in which we seek $n\times r$ and $m\times r$ matrices $U$ and $V$ such that $UV^T$ approximates a given matrix $X^\star$. Beyond the robust strict saddle property, we show that the objective function of the matrix factorization problem has no spurious local minima and obeys the strict saddle property not only in the exact-parameterization case where $\operatorname{rank}(X^\star) = r$, but also in the over-parameterization case where $\operatorname{rank}(X^\star) < r$ and the under-parameterization case where $\operatorname{rank}(X^\star) > r$. These geometric properties imply that a number of iterative optimization algorithms (such as gradient descent) converge to a global solution with random initialization.
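As a rough illustration of the matrix factorization setting described above, the following Python sketch runs plain gradient descent from a small random initialization on the objective $\frac{1}{2}\|UV^T - X^\star\|_F^2$. The unregularized objective, the step size, the iteration count, the initialization scale, and the helper names `make_low_rank` and `factored_gradient_descent` are all assumptions chosen for illustration; they are not taken from the paper, whose exact formulation may differ.

```python
import numpy as np


def make_low_rank(n, m, singular_values, rng):
    """Build an n x m test matrix with the given nonzero singular values."""
    k = len(singular_values)
    # Orthonormal columns from reduced QR factorizations of Gaussian matrices.
    left, _ = np.linalg.qr(rng.standard_normal((n, k)))
    right, _ = np.linalg.qr(rng.standard_normal((m, k)))
    return left @ np.diag(singular_values) @ right.T


def factored_gradient_descent(X_star, r, step=0.1, iters=3000,
                              init_scale=0.1, seed=0):
    """Gradient descent on 0.5 * ||U V^T - X_star||_F^2 over U (n x r), V (m x r).

    Illustrative assumptions: fixed step size, small Gaussian random
    initialization, and no regularizer.
    """
    rng = np.random.default_rng(seed)
    n, m = X_star.shape
    U = init_scale * rng.standard_normal((n, r))
    V = init_scale * rng.standard_normal((m, r))
    for _ in range(iters):
        residual = U @ V.T - X_star   # n x m
        grad_U = residual @ V         # gradient with respect to U
        grad_V = residual.T @ U       # gradient with respect to V
        # Simultaneous update: both gradients use the old U and V.
        U, V = U - step * grad_U, V - step * grad_V
    return U, V


# Exact-parameterization example: rank(X_star) = r = 3.
rng = np.random.default_rng(1)
X_star = make_low_rank(20, 15, [1.0, 0.7, 0.4], rng)
U, V = factored_gradient_descent(X_star, r=3)
rel_err = np.linalg.norm(U @ V.T - X_star) / np.linalg.norm(X_star)
print(f"relative error: {rel_err:.2e}")
```

Choosing `r` larger or smaller than the number of nonzero singular values passed to `make_low_rank` gives a quick way to experiment with the over- and under-parameterized cases; the sketch makes no claim about convergence speed in those regimes.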