The paper discusses derivative-free optimization (DFO), in which a function is minimized using only function evaluations, without access to gradients or directional derivatives. Classical DFO methods such as Nelder-Mead and direct search have limited scalability for high-dimensional problems. Zeroth-order methods, which mimic gradient-based methods by approximating derivatives from function values, have been gaining popularity due to the demands of large-scale machine learning applications, and the paper focuses on the selection of the step size $\alpha_k$ in these methods. The proposed approach, called Curvature-Aware Random Search (CARS), uses first- and second-order finite difference approximations to compute a candidate step size $\alpha_{+}$. We prove that for strongly convex objective functions, CARS converges linearly provided the search direction is drawn from a distribution satisfying very mild conditions. We also present a cubic-regularized variant of CARS, named CARS-CR, which converges at a rate of $\mathcal{O}(k^{-1})$ without the assumption of strong convexity. Numerical experiments show that CARS and CARS-CR match or exceed state-of-the-art methods on benchmark problem sets.
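For concreteness, below is a minimal Python sketch of one curvature-aware random-search step of the kind described above: a random direction is drawn, the directional first and second derivatives are estimated by central finite differences, and a Newton-like candidate step $\alpha_{+}$ is formed along that direction. The unit-Gaussian sampling distribution, the spacing `h`, and the simple acceptance test are illustrative assumptions, not the paper's exact algorithm (which includes safeguards and, in CARS-CR, a cubic-regularization term).

```python
import numpy as np

def cars_step(f, x, h=1e-4, rng=np.random.default_rng()):
    """One curvature-aware random-search step (illustrative sketch).

    Draws a random direction u, estimates the directional first and second
    derivatives of f at x by central finite differences, and forms the
    Newton-like candidate step alpha = -d1 / d2 along u. The candidate point
    is accepted only if it decreases f.
    """
    u = rng.standard_normal(x.shape)
    u /= np.linalg.norm(u)                      # unit-norm search direction

    f0 = f(x)
    f_plus = f(x + h * u)
    f_minus = f(x - h * u)

    d1 = (f_plus - f_minus) / (2 * h)           # directional derivative estimate
    d2 = (f_plus - 2 * f0 + f_minus) / h ** 2   # directional curvature estimate

    if d2 <= 0:                                 # no useful curvature along u; keep x
        return x, f0

    alpha = -d1 / d2                            # candidate step size alpha_+
    x_new = x + alpha * u
    f_new = f(x_new)
    return (x_new, f_new) if f_new < f0 else (x, f0)


# Usage: minimize a simple quadratic from a nonzero starting point
if __name__ == "__main__":
    f = lambda x: 0.5 * np.sum(x ** 2)
    x = np.ones(10)
    for _ in range(200):
        x, fx = cars_step(f, x)
    print(fx)
```

Each step uses a small, fixed number of function evaluations, which is what makes this style of method attractive when gradients are unavailable.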