We study first-order methods with preconditioning for solving structured nonlinear convex optimization problems. We propose a new family of preconditioners generated by symmetric polynomials. They provide first-order optimization methods with a provable improvement of the condition number, cutting the gaps between the largest eigenvalues, without explicit knowledge of the actual spectrum. We give a stochastic interpretation of this preconditioning in terms of coordinate volume sampling and compare it with other classical approaches, including the Chebyshev polynomials. We show how to incorporate polynomial preconditioning into the Gradient and Fast Gradient Methods and establish the corresponding global complexity bounds. Finally, we propose a simple adaptive search procedure that automatically chooses the best possible polynomial preconditioning for the Gradient Method, minimizing the objective along a low-dimensional Krylov subspace. Numerical experiments confirm the efficiency of our preconditioning strategies for solving various machine learning problems.
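To make the general idea concrete, here is a minimal sketch of a polynomial-preconditioned gradient method on a convex quadratic. It does not reproduce the symmetric-polynomial construction analyzed in the paper: the polynomial p(t) = 2 - t, the test spectrum, and the function names (apply_polynomial, preconditioned_gradient_method) are illustrative choices made for this demo. What the sketch does show is the mechanism the abstract describes: the gradient is passed through p(A), evaluated by Horner's scheme with matrix-vector products only, so the actual spectrum of A is never needed explicitly.

```python
import numpy as np

def apply_polynomial(A, coeffs, v):
    """Evaluate p(A) v via Horner's scheme using matrix-vector products only.

    coeffs lists the polynomial coefficients from the highest degree down,
    so the spectrum of A is never computed explicitly.
    """
    result = coeffs[0] * v
    for c in coeffs[1:]:
        result = A @ result + c * v
    return result

def preconditioned_gradient_method(A, b, coeffs, step, iters=2000):
    """Gradient descent on f(x) = 0.5 x^T A x - b^T x, with the gradient
    filtered through the fixed polynomial preconditioner p(A)."""
    x = np.zeros_like(b)
    for _ in range(iters):
        g = A @ x - b                       # exact gradient of the quadratic
        x = x - step * apply_polynomial(A, coeffs, g)
    return x

# Toy usage: A has eigenvalues in (0, 1], so p(t) = 2 - t is positive on the
# spectrum, and the preconditioned operator p(A) A has eigenvalues t (2 - t):
# the smallest eigenvalue rises from ~0.01 to ~0.02 (condition number roughly
# halved) while the largest stays at 1, so step = 1 remains valid.
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((50, 50)))
A = Q @ np.diag(np.linspace(0.01, 1.0, 50)) @ Q.T
b = rng.standard_normal(50)
x = preconditioned_gradient_method(A, b, coeffs=[-1.0, 2.0], step=1.0)
print(np.linalg.norm(A @ x - b))            # residual should be near zero
```

Each iteration costs only deg p extra matrix-vector products with A, which is why this style of preconditioning fits naturally into first-order methods; the paper's contribution is the specific family of polynomials, generated by symmetric polynomials, with provable condition-number guarantees.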