High-dimensional nonconvex problems arise frequently in today's machine learning and statistical genetics research. Recently, Ghadimi and Lan [1] proposed an algorithm to optimize nonconvex high-dimensional problems. Their algorithm has several parameters that must be set before it is run. Choosing these parameters is not trivial, and, to the best of our knowledge, there is no explicit rule for selecting them so that the algorithm converges faster. We analyze Ghadimi and Lan's algorithm and obtain an interpretation based on the inequality constraints required for convergence and the upper bound on the norm of the gradient analogue. Our interpretation suggests that their algorithm is a damped Nesterov acceleration scheme. Based on this interpretation, we propose an approach for selecting the parameters to improve the convergence of the algorithm. Our numerical studies on high-dimensional nonconvex sparse learning problems with over $10000$ variables, motivated by image denoising and statistical genetics applications, show that when the parameters are chosen using our proposed approach, convergence is, on average, considerably faster than that of the conventional ISTA algorithm.
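For context, the following is a minimal sketch of the conventional ISTA baseline referenced above, applied to an $\ell_1$-regularized least-squares problem as a representative sparse learning instance; the function names and the specific objective are illustrative assumptions and need not match the problems studied in the paper.

```python
import numpy as np

def soft_threshold(z, tau):
    """Elementwise soft-thresholding: the proximal operator of tau * ||.||_1."""
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)

def ista(A, b, lam, step, n_iter=500):
    """ISTA for min_x 0.5 * ||Ax - b||^2 + lam * ||x||_1.

    `step` should be at most 1 / L, where L is the largest eigenvalue of
    A^T A (the Lipschitz constant of the gradient of the smooth term).
    """
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ x - b)                    # gradient of the smooth part
        x = soft_threshold(x - step * grad, step * lam)  # proximal (shrinkage) step
    return x
```

Accelerated schemes such as Ghadimi and Lan's interleave this proximal gradient step with momentum-type extrapolation governed by the parameters discussed above, which is where the choice of parameters affects the convergence speed.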