Optimizing expensive-to-evaluate black-box functions of discrete (and potentially continuous) design parameters is a ubiquitous problem in scientific and engineering applications. Bayesian optimization (BO) is a popular, sample-efficient method that leverages a probabilistic surrogate model and an acquisition function (AF) to select promising designs to evaluate. However, maximizing the AF over mixed or high-cardinality discrete search spaces is challenging: standard gradient-based methods cannot be used directly, and evaluating the AF at every point in the search space would be computationally prohibitive. To address this issue, we propose using probabilistic reparameterization (PR). Instead of directly optimizing the AF over the search space containing discrete parameters, we maximize the expectation of the AF over a probability distribution defined by continuous parameters. We prove that under suitable reparameterizations, the BO policy that maximizes the probabilistic objective is the same as that which maximizes the AF, and therefore PR enjoys the same regret bounds as the original BO policy using the underlying AF. Moreover, our approach provably converges to a stationary point of the probabilistic objective under gradient ascent using scalable, unbiased estimators of both the probabilistic objective and its gradient. Therefore, as the number of starting points and gradient steps increases, our approach will recover a maximizer of the AF (an often-neglected requisite for commonly used BO regret bounds). We validate our approach empirically and demonstrate state-of-the-art optimization performance on a wide range of real-world applications. PR is complementary to (and benefits) recent work and naturally generalizes to settings with multiple objectives and black-box constraints.
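The core idea can be illustrated with a minimal sketch. Below, binary design parameters are reparameterized as Bernoulli probabilities theta, the probabilistic objective E_theta[AF(z)] is estimated by Monte Carlo, and an unbiased score-function (REINFORCE-style) gradient estimator drives gradient ascent on theta. This is an assumption-laden toy, not the paper's implementation: `acquisition` is a hypothetical stand-in for an AF computed from a surrogate model, and the estimator, step size, and sample count are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

def acquisition(z):
    # Hypothetical toy AF over 5 binary parameters: highest at `target`.
    # In real BO this would be, e.g., expected improvement under a GP.
    target = np.array([1, 0, 1, 1, 0])
    return -np.abs(z - target).sum(axis=-1).astype(float)

def pr_objective_and_grad(theta, n_samples=256):
    # Probabilistic reparameterization: z_i ~ Bernoulli(theta_i).
    # Objective: E_theta[AF(z)], estimated by Monte Carlo.
    z = (rng.random((n_samples, theta.size)) < theta).astype(float)
    af = acquisition(z)
    # Score function d/dtheta log p(z | theta) for a Bernoulli:
    score = z / theta - (1.0 - z) / (1.0 - theta)
    # Mean baseline reduces variance of the gradient estimate.
    adv = af - af.mean()
    grad = (adv[:, None] * score).mean(axis=0)
    return af.mean(), grad

theta = np.full(5, 0.5)  # start from uniform probabilities
for _ in range(300):     # gradient ascent on the continuous parameters
    _, grad = pr_objective_and_grad(theta)
    theta = np.clip(theta + 0.05 * grad, 0.01, 0.99)

best = (theta > 0.5).astype(int)  # round back to a discrete design
print(best.tolist())
```

Because the probabilistic objective is defined over continuous parameters, standard continuous optimizers (here, plain gradient ascent with multiple restarts in practice) apply even though the underlying design space is discrete.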