Bayesian optimization (BO) is a very effective approach for sequential black-box optimization where direct queries of the objective function are expensive. However, there are cases where the objective function can only be accessed via preference judgments, such as "this is better than that" between two candidate solutions (as in A/B tests or recommender systems). The state-of-the-art approach to Preferential Bayesian Optimization (PBO) uses a Gaussian process to model the preference function and a Bernoulli likelihood to model the observed pairwise comparisons. Laplace's method is then employed to compute posterior inferences and, in particular, to build an appropriate acquisition function. In this paper, we prove that the true posterior distribution of the preference function is a Skew Gaussian Process (SkewGP) with highly skewed pairwise marginals and, thus, show that Laplace's method usually provides a very poor approximation. We then derive an efficient method to compute the exact SkewGP posterior and use it as the surrogate model for PBO, employing standard acquisition functions (Upper Credible Bound, etc.). We illustrate the benefits of our exact PBO-SkewGP in a variety of experiments, showing that it consistently outperforms PBO based on Laplace's approximation in terms of both convergence speed and computational time. We also show that our framework can be extended to deal with mixed preferential-categorical BO, typical, for instance, in smart manufacturing, where binary judgments (valid or non-valid) are available together with preference judgments.
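To make the setup concrete, the sketch below illustrates the Laplace-based PBO baseline that the abstract refers to: a Gaussian process prior over a latent utility, a probit (Bernoulli) likelihood on pairwise duels, a Laplace approximation of the posterior, and an Upper Credible Bound acquisition. It is a minimal illustration, not the paper's exact SkewGP method; the RBF kernel, its hyperparameters, the UCB coefficient, and the toy objective `g` are assumptions made only for this example.

```python
# Minimal sketch (assumed setup): GP prior on a latent utility f, probit likelihood
# on pairwise comparisons, Laplace approximation, UCB acquisition on a 1-D grid.
import numpy as np
from scipy.stats import norm
from scipy.optimize import minimize

def rbf_kernel(A, B, lengthscale=0.2, variance=1.0):
    """Squared-exponential kernel between the rows of A and B (assumed choice)."""
    d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return variance * np.exp(-0.5 * d2 / lengthscale**2)

def laplace_fit(X, duels, jitter=1e-6):
    """MAP estimate of the latent utility at X under the pairwise probit likelihood."""
    n = X.shape[0]
    K = rbf_kernel(X, X) + jitter * np.eye(n)
    K_inv = np.linalg.inv(K)
    w, l = duels[:, 0], duels[:, 1]          # winner / loser indices of each duel

    def neg_log_post(f):
        z = f[w] - f[l]
        return 0.5 * f @ K_inv @ f - np.sum(norm.logcdf(z))

    def grad(f):
        z = f[w] - f[l]
        r = np.exp(norm.logpdf(z) - norm.logcdf(z))   # phi(z) / Phi(z), stable form
        g = K_inv @ f
        np.add.at(g, w, -r)
        np.add.at(g, l, r)
        return g

    f_hat = minimize(neg_log_post, np.zeros(n), jac=grad, method="L-BFGS-B").x
    # Hessian of the negative log-likelihood; it couples the two sides of each duel.
    z = f_hat[w] - f_hat[l]
    r = np.exp(norm.logpdf(z) - norm.logcdf(z))
    c = r * (z + r)
    W = np.zeros((n, n))
    for k in range(len(duels)):
        e = np.zeros(n); e[w[k]], e[l[k]] = 1.0, -1.0
        W += c[k] * np.outer(e, e)
    A = np.linalg.inv(K_inv + W)             # Laplace posterior covariance of f
    return f_hat, K_inv, A

def predict(X, Xstar, f_hat, K_inv, A):
    """Laplace-approximate posterior mean and std of the utility at new points."""
    Ks = rbf_kernel(Xstar, X)
    mu = Ks @ K_inv @ f_hat
    cov = (rbf_kernel(Xstar, Xstar) - Ks @ K_inv @ Ks.T
           + Ks @ K_inv @ A @ K_inv @ Ks.T)
    return mu, np.sqrt(np.clip(np.diag(cov), 1e-12, None))

# Toy 1-D run: duels are decided by a hidden utility g (hypothetical objective).
rng = np.random.default_rng(0)
g = lambda x: -(x - 0.3) ** 2
X = rng.uniform(0, 1, size=(4, 1))           # initial design
duels = np.array([(i, j) if g(X[i, 0]) > g(X[j, 0]) else (j, i)
                  for i in range(len(X)) for j in range(i + 1, len(X))])
grid = np.linspace(0, 1, 200)[:, None]
for _ in range(10):
    f_hat, K_inv, A = laplace_fit(X, duels)
    mu, sd = predict(X, grid, f_hat, K_inv, A)
    x_new = grid[np.argmax(mu + 2.0 * sd)]   # Upper Credible Bound acquisition
    inc = int(np.argmax(f_hat))              # current incumbent
    new_idx = len(X)
    X = np.vstack([X, x_new])
    duel = (new_idx, inc) if g(x_new[0]) > g(X[inc, 0]) else (inc, new_idx)
    duels = np.vstack([duels, duel])
print("best point found:", X[np.argmax(laplace_fit(X, duels)[0])])
```

The paper's argument is that the Gaussian posterior produced by this Laplace step misses the strong skewness of the true posterior; the exact SkewGP posterior replaces that approximation while the rest of the loop (duel collection, acquisition maximization) stays the same.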