Motivated by the growing interest in non-convex optimization algorithms for training deep neural networks and for other optimization problems in data analysis, we give an overview of recent theoretical results on global performance guarantees of optimization algorithms for non-convex problems. We start with classical arguments showing that general non-convex problems cannot be solved efficiently. Then we list problems whose global minimizer can be found efficiently by exploiting the structure of the problem as much as possible. Another way to deal with non-convexity is to relax the goal from finding the global minimum to finding a stationary point or a local minimum. For this setting, we first present known convergence rates of deterministic first-order methods, followed by a general theoretical analysis of optimal stochastic and randomized gradient schemes and an overview of stochastic first-order methods. After that, we discuss quite general classes of non-convex problems, such as minimization of $\alpha$-weakly-quasi-convex functions and of functions that satisfy the Polyak--Łojasiewicz condition, for which first-order methods still admit theoretical convergence guarantees. Finally, we consider higher-order and zeroth-order/derivative-free methods and their convergence rates for non-convex optimization problems.
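
For orientation, the two function classes named above are commonly stated as follows. This is a sketch in notation assumed here and not fixed by the abstract: $f$ is differentiable, $x^*$ is a global minimizer, $f^* = f(x^*)$, and $\alpha$ and $\mu$ are the class parameters. A function is $\alpha$-weakly-quasi-convex with respect to $x^*$ if

$$
\alpha \bigl( f(x) - f(x^*) \bigr) \le \langle \nabla f(x),\, x - x^* \rangle \quad \text{for all } x, \qquad \alpha \in (0, 1],
$$

and it satisfies the Polyak--Łojasiewicz condition if

$$
f(x) - f^* \le \frac{1}{2\mu} \, \|\nabla f(x)\|_2^2 \quad \text{for all } x, \qquad \mu > 0.
$$

Neither condition requires convexity. As an example of the guarantees they allow, if $f$ additionally has an $L$-Lipschitz gradient (a further assumption), gradient descent with step size $1/L$ under the Polyak--Łojasiewicz condition satisfies $f(x_k) - f^* \le (1 - \mu/L)^k \bigl( f(x_0) - f^* \bigr)$.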