qEUBO: A Decision-Theoretic Acquisition Function for Preferential Bayesian Optimization

Preferential Bayesian optimization (PBO) is a framework for optimizing a decision maker's latent utility function using preference feedback. This work introduces the expected utility of the best option (qEUBO) as a novel acquisition function for PBO. When the decision maker's responses are noise-free, we show that qEUBO is one-step Bayes optimal and thus equivalent to the popular knowledge gradient acquisition function. We also show that qEUBO enjoys an additive constant approximation guarantee to the one-step Bayes-optimal policy when the decision maker's responses are corrupted by noise. We provide an extensive evaluation of qEUBO and demonstrate that it outperforms the state-of-the-art acquisition functions for PBO across many settings. Finally, we show that, under sufficient regularity conditions, qEUBO's Bayesian simple regret converges to zero at a rate $o(1/n)$ as the number of queries, $n$, goes to infinity. In contrast, we show that simple regret under qEI, a popular acquisition function for standard BO often used for PBO, can fail to converge to zero. Enjoying superior performance, simple computation, and a grounded decision-theoretic justification, qEUBO is a promising acquisition function for PBO.
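To make the name concrete, the sketch below estimates the expected utility of the best option for a query of $q$ alternatives by Monte Carlo, i.e. the posterior expectation of $\max\{f(x_1), \dots, f(x_q)\}$. It assumes the posterior over the latent utilities at the queried points is summarized by a Gaussian with a given mean vector and covariance matrix. The function name `qeubo_mc`, its parameters, and the numbers in the example are hypothetical and used only for illustration; this is not the authors' implementation.

```python
import numpy as np


def qeubo_mc(mean, cov, num_samples=4096, seed=0):
    """Monte Carlo estimate of E[max_i f(x_i)] for a Gaussian posterior.

    mean: length-q posterior mean of the utilities at the q query points.
    cov:  q-by-q posterior covariance of those utilities.
    (Hypothetical helper for illustration only.)
    """
    mean = np.asarray(mean, dtype=float)
    cov = np.asarray(cov, dtype=float)
    q = mean.shape[0]
    rng = np.random.default_rng(seed)
    # A small diagonal jitter keeps the Cholesky factorization stable
    # when the covariance is (nearly) singular.
    chol = np.linalg.cholesky(cov + 1e-9 * np.eye(q))
    # Draw joint samples of the utilities: mean + L z with z ~ N(0, I).
    z = rng.standard_normal((num_samples, q))
    samples = mean + z @ chol.T
    # Utility of the best option in each sample, averaged over samples.
    return float(samples.max(axis=1).mean())


# Compare two candidate pairwise queries (q = 2) under assumed posteriors.
safe_query = qeubo_mc(mean=[0.5, 0.4], cov=[[0.01, 0.0], [0.0, 0.01]])
exploratory_query = qeubo_mc(mean=[0.5, 0.3], cov=[[0.01, 0.0], [0.0, 1.0]])
print(safe_query, exploratory_query)
```

In a PBO loop, an estimate of this kind would be maximized over candidate queries, and the decision maker would then be asked which option in the selected query they prefer. Because the maximum is taken inside the expectation, a query containing an uncertain option can score higher than one containing only well-understood options with similar means.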
