Safe non-smooth black-box optimization with application to policy search

For safety-critical black-box optimization tasks, observations of the constraints and the objective are often noisy and available only at feasible points. We propose an approach based on log barriers to find a local solution of a non-convex, non-smooth black-box optimization problem $\min f^0(x)$ subject to $f^i(x)\leq 0,~ i = 1,\ldots, m$, while guaranteeing constraint satisfaction with high probability throughout the learning process. The proposed algorithm uses noisy observations to iteratively improve on an initial safe point until convergence. We derive its convergence rate and prove its safety. We demonstrate its performance on an iterative control design problem.
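The log-barrier idea can be illustrated with a minimal sketch. This is not the algorithm analyzed in the paper: it assumes noise-free function values, a known Lipschitz constant `lip` for the constraint, and a hypothetical two-dimensional test problem (`f0`, `f1`) chosen purely for illustration. It minimizes the barrier surrogate $f^0(x) - \eta \log(-f^1(x))$ with finite-difference gradients, and caps both the probe radius and the step length at half the current constraint margin divided by `lip`, so that every queried point remains strictly feasible.

```python
import numpy as np

def f0(x):
    # Hypothetical non-smooth objective with a kink at x[0] = 0.5.
    return abs(x[0] - 0.5) + abs(x[1] - 2.0)

def f1(x):
    # Hypothetical constraint f1(x) <= 0 (unit disc); 1-Lipschitz.
    return np.linalg.norm(x) - 1.0

def barrier(x, eta):
    # Log-barrier surrogate; finite only at strictly feasible points.
    return f0(x) - eta * np.log(-f1(x))

def safe_barrier_descent(x0, eta=0.1, lip=1.0, base_step=0.05, iters=300):
    """Noise-free sketch: descend the barrier using only feasible queries."""
    x = np.asarray(x0, dtype=float)
    path = [x.copy()]
    for _ in range(iters):
        slack = -f1(x)  # current constraint margin, strictly positive
        # Probes within slack / (2 * lip) of x cannot violate the constraint.
        h = min(1e-4, slack / (2.0 * lip))
        grad = np.zeros_like(x)
        for j in range(x.size):
            e = np.zeros_like(x)
            e[j] = h
            grad[j] = (barrier(x + e, eta) - barrier(x - e, eta)) / (2.0 * h)
        gnorm = np.linalg.norm(grad)
        if gnorm == 0.0:
            break
        # Cap the move at slack / (2 * lip) so the next iterate stays feasible.
        step = min(base_step, slack / (2.0 * lip * gnorm))
        x = x - step * grad
        path.append(x.copy())
    return x, np.array(path)

if __name__ == "__main__":
    x_final, path = safe_barrier_descent([0.0, 0.0])
    print("final point:", x_final, "objective:", f0(x_final))
    print("largest constraint value along the path:", max(f1(p) for p in path))
```

The step cap relies on the Lipschitz bound $|f^1(y) - f^1(x)| \le L\,\|y - x\|$: moving by at most half the margin divided by $L$ leaves at least half the margin intact. With noisy observations, which is the setting the abstract addresses, the margin itself is uncertain and this simple rule would need additional confidence terms.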
