Sparse Gaussian processes are a key component of high-throughput Bayesian optimisation (BO) loops, an increasingly common setting where evaluation budgets are large and highly parallelised. Sparse models dramatically reduce the computational cost of surrogate modelling by building approximate posteriors from a small set of pseudo-observations, the so-called inducing points, in lieu of the full data set. However, current approaches to designing inducing points are not appropriate within BO loops, as they seek to reduce global uncertainty in the objective function. Thus, the high-fidelity modelling of promising and data-dense regions required for precise optimisation is sacrificed, and computational resources are instead wasted on modelling areas of the space already known to be sub-optimal. Inspired by entropy-based BO methods, we propose a novel inducing point design that uses a principled information-theoretic criterion to select inducing points. By choosing inducing points to maximally reduce both global uncertainty and uncertainty in the maximum value of the objective function, we build surrogate models able to support high-precision, high-throughput BO.
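To make the idea of objective-aware inducing point selection concrete, the following is a minimal illustrative sketch and not the criterion proposed in this work. It assumes a unit-variance squared-exponential kernel and greedily selects the candidate with the largest quality-weighted conditional prior variance, using a pivoted-Cholesky-style update. The names `rbf_kernel`, `greedy_inducing_points`, and the `quality` weights are hypothetical; here `quality` is a hand-picked bump around a region assumed to be promising, standing in for any measure of how informative a point is about the maximum.

```python
import numpy as np


def rbf_kernel(a, b, lengthscale):
    """Unit-variance squared-exponential kernel between a (n, d) and b (m, d)."""
    sq_dists = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    return np.exp(-0.5 * sq_dists / lengthscale**2)


def greedy_inducing_points(x, num_inducing, quality=None, lengthscale=0.1):
    """Greedily choose row indices of x to serve as inducing points.

    Each step picks the unchosen point maximising
    quality**2 * (prior variance conditioned on the points chosen so far).
    With quality=None this reduces to plain greedy variance reduction.
    """
    n = x.shape[0]
    if num_inducing > n:
        raise ValueError("cannot select more inducing points than data points")
    quality = np.ones(n) if quality is None else np.asarray(quality, dtype=float)
    residual_var = np.ones(n)  # kernel diagonal is 1 for this kernel
    chol_rows = np.zeros((num_inducing, n))  # partial Cholesky factor, row-wise
    available = np.ones(n, dtype=bool)
    chosen = []
    for j in range(num_inducing):
        scores = np.where(available, quality**2 * residual_var, -np.inf)
        i = int(np.argmax(scores))
        chosen.append(i)
        available[i] = False
        # Pivoted-Cholesky update of every point's conditional variance.
        k_i = rbf_kernel(x, x[i : i + 1], lengthscale)[:, 0]
        pivot = np.sqrt(max(residual_var[i], 1e-12))  # guard against division by ~0
        row = (k_i - chol_rows[:j].T @ chol_rows[:j, i]) / pivot
        chol_rows[j] = row
        residual_var = np.clip(residual_var - row**2, 0.0, None)
    return np.array(chosen)


# Toy 1-D candidate set; the bump at 0.7 is an assumed "promising" region.
x = np.linspace(0.0, 1.0, 200)[:, None]
quality = np.exp(-0.5 * (x[:, 0] - 0.7) ** 2 / 0.1**2)
uniform = greedy_inducing_points(x, num_inducing=6)
focused = greedy_inducing_points(x, num_inducing=6, quality=quality)
```

In this toy setting the unweighted selection spreads inducing points across the whole domain, whereas the weighted selection concentrates them around the region marked as promising, which is the qualitative behaviour the abstract argues a BO surrogate needs.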