Exploiting Higher Order Smoothness in Derivative-free Optimization and Continuous Bandits

We study the problem of zero-order optimization of a strongly convex function. The goal is to find the minimizer of the function by sequentially exploring its values under measurement noise. We study the impact of higher order smoothness properties of the function on the optimization error and on the cumulative regret. To solve this problem we consider a randomized approximation of the projected gradient descent algorithm. The gradient is estimated by a randomized procedure involving two function evaluations and a smoothing kernel. We derive upper bounds for this algorithm in both the constrained and unconstrained settings, and prove minimax lower bounds for any sequential search method. Our results imply that the zero-order algorithm is nearly optimal in terms of sample complexity and the problem parameters. Based on this algorithm, we also propose an estimator of the minimum value of the function that achieves almost sharp oracle behavior. We compare our results with the state of the art, highlighting a number of key improvements.
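The procedure described above (projected gradient descent driven by a gradient estimate built from two noisy function evaluations and a smoothing kernel) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the abstract: the kernel K(r) = 3r, the step size 1/(alpha t), the perturbation size t^(-1/(2 beta)), the Euclidean-ball constraint set, and the names `zero_order_pgd`, `project_ball`, `kernel`, and `f_noisy`.

```python
import numpy as np

def project_ball(x, radius):
    """Euclidean projection onto the ball {x : ||x|| <= radius}."""
    norm = np.linalg.norm(x)
    return x if norm <= radius else x * (radius / norm)

def kernel(r):
    # Assumed smoothing kernel K(r) = 3r. For r uniform on [-1, 1] it
    # satisfies E[K(r)] = 0 and E[r K(r)] = 1.
    return 3.0 * r

def zero_order_pgd(f_noisy, x0, n_steps, alpha, beta=2.0, radius=2.0, seed=0):
    """Sketch of zero-order projected gradient descent.

    f_noisy(x, rng) returns a noisy evaluation of the objective at x.
    alpha is an assumed strong-convexity constant, beta an assumed
    smoothness index; both only set the illustrative schedules below.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    d = x.size
    for t in range(1, n_steps + 1):
        h = t ** (-1.0 / (2.0 * beta))   # assumed perturbation size
        eta = 1.0 / (alpha * t)          # assumed step size
        zeta = rng.standard_normal(d)
        zeta /= np.linalg.norm(zeta)     # uniform direction on the unit sphere
        r = rng.uniform(-1.0, 1.0)       # random radius fed to the kernel
        # Two noisy function evaluations at symmetric query points.
        y_plus = f_noisy(x + h * r * zeta, rng)
        y_minus = f_noisy(x - h * r * zeta, rng)
        # Randomized gradient estimate using the smoothing kernel.
        g = (d / (2.0 * h)) * (y_plus - y_minus) * kernel(r) * zeta
        # Gradient step followed by projection onto the constraint set.
        x = project_ball(x - eta * g, radius)
    return x

# Hypothetical test problem: a strongly convex quadratic with Gaussian noise.
x_star = np.array([0.5, -0.3, 0.2])

def f_noisy(x, rng):
    return 0.5 * np.sum((x - x_star) ** 2) + 0.1 * rng.standard_normal()

x_hat = zero_order_pgd(f_noisy, np.zeros(3), n_steps=5000, alpha=1.0)
print(np.linalg.norm(x_hat - x_star))
```

On this quadratic the symmetric difference removes the even-order terms, so the estimate is unbiased for the gradient; the printed distance to the minimizer should be small, though its exact value depends on the seed and the noise level.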
