Bayesian optimization is a technique for optimizing black-box target functions. At the core of Bayesian optimization is a surrogate model that predicts the output of the target function at previously unseen inputs to facilitate the selection of promising input values. Gaussian processes (GPs) are commonly used as surrogate models but are known to scale poorly with the number of observations. We adapt the Vecchia approximation, a popular GP approximation from spatial statistics, to enable scalable high-dimensional Bayesian optimization. We develop several improvements and extensions, including training warped GPs using mini-batch gradient descent, approximate neighbor search, and selecting multiple input values in parallel. We focus on the use of our warped Vecchia GP in trust-region Bayesian optimization via Thompson sampling. On several test functions and on two reinforcement-learning problems, our methods compared favorably to the state of the art.
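To make the core idea concrete: the Vecchia approximation replaces the GP joint density with a product of univariate conditionals, each conditioning on at most m previously ordered nearest neighbors, reducing the O(n³) cost of exact GP inference to roughly O(n·m³). The sketch below is a minimal, hypothetical illustration of this factorization for the GP log-likelihood (it is not the authors' implementation; the ordering, RBF kernel, and neighbor rule are illustrative assumptions):

```python
import numpy as np

def rbf_kernel(A, B, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel; an illustrative choice, not necessarily the paper's."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return variance * np.exp(-0.5 * d2 / lengthscale**2)

def vecchia_loglik(X, y, m=3, noise=1e-6):
    """Vecchia-approximate GP log-likelihood.

    Factorizes p(y) as prod_i p(y_i | y_{c(i)}), where c(i) holds the m
    nearest neighbors of x_i among the previously ordered points.  Here the
    ordering is simply the input order (a simplifying assumption; in practice
    orderings such as maximin are used).
    """
    n = len(y)
    ll = 0.0
    for i in range(n):
        if i == 0:
            mean = 0.0
            var = rbf_kernel(X[:1], X[:1])[0, 0] + noise
        else:
            prev = np.arange(i)
            dists = np.linalg.norm(X[prev] - X[i], axis=1)
            nb = prev[np.argsort(dists)[:m]]          # m nearest earlier points
            K_nn = rbf_kernel(X[nb], X[nb]) + noise * np.eye(len(nb))
            k_in = rbf_kernel(X[i:i + 1], X[nb])[0]
            sol = np.linalg.solve(K_nn, k_in)
            mean = sol @ y[nb]                         # conditional mean
            var = rbf_kernel(X[i:i + 1], X[i:i + 1])[0, 0] + noise - sol @ k_in
        ll += -0.5 * (np.log(2 * np.pi * var) + (y[i] - mean) ** 2 / var)
    return ll
```

When m ≥ i for every i, each conditional conditions on all preceding points and the factorization recovers the exact GP log-likelihood; smaller m trades accuracy for speed, which is what makes the approximation attractive for Bayesian optimization with many observations.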