Preferential Bayesian optimisation (PBO) deals with optimisation problems where the objective function can only be accessed via preference judgments, such as "this is better than that" between two candidate solutions (as in A/B tests or recommender systems). The state-of-the-art approach to PBO uses a Gaussian process to model the preference function and a Bernoulli likelihood to model the observed pairwise comparisons. Laplace's method is then employed to compute posterior inferences and, in particular, to build an appropriate acquisition function. In this paper, we prove that the true posterior distribution of the preference function is a Skew Gaussian Process (SkewGP), with highly skewed pairwise marginals, and thus show that Laplace's method usually provides a very poor approximation. We then derive an efficient method to compute the exact SkewGP posterior and use it as a surrogate model for PBO with standard acquisition functions (Upper Credible Bound, etc.). We illustrate the benefits of our exact PBO-SkewGP in a variety of experiments, showing that it consistently outperforms PBO based on Laplace's approximation in terms of both convergence speed and computational time. We also show that our framework can be extended to handle mixed preferential-categorical BO, where binary judgments (valid or non-valid) are available together with preference judgments.
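To make the setup concrete, here is a minimal sketch of two ingredients mentioned above: the probit pairwise-preference likelihood, p(x ≻ x') = Φ(f(x) − f(x')), where f is the latent preference function (modelled with a GP in the paper), and the Upper Credible Bound acquisition rule. The function names and the toy latent values are illustrative, not the paper's implementation.

```python
import math

def std_normal_cdf(z):
    """Standard normal CDF, computed via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def preference_probability(f_x, f_xp):
    """Probability that x is judged better than x' under the probit model:
    p(x > x') = Phi(f(x) - f(x'))."""
    return std_normal_cdf(f_x - f_xp)

def upper_credible_bound(mean, std, beta=2.0):
    """UCB acquisition: posterior mean plus beta times posterior std.
    In PBO, mean and std come from the surrogate posterior over f
    (the exact SkewGP posterior in the paper, rather than a Laplace fit)."""
    return mean + beta * std

# Toy illustration with stubbed latent values f(x) = 1.0, f(x') = 0.0:
p = preference_probability(1.0, 0.0)   # > 0.5: x is likely preferred
a = upper_credible_bound(0.3, 0.4)     # candidate score used to pick the next query
```

A PBO loop would repeatedly maximise the acquisition over candidate pairs, query the user for a comparison, and update the surrogate posterior with the new judgment.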