We study how Bayesian optimization (BO) on a target task can be accelerated with historical knowledge transferred from related source tasks. Existing work on BO with knowledge transfer either lacks theoretical guarantees or achieves the same regret as BO in the non-transfer setting, $\tilde{\mathcal{O}}(\sqrt{T \gamma_f})$, where $T$ is the number of evaluations of the target function and $\gamma_f$ denotes its information gain. In this paper, we propose the DeltaBO algorithm, which builds a novel uncertainty-quantification approach on the difference function $\delta$ between the source and target functions; the two functions are allowed to belong to different reproducing kernel Hilbert spaces (RKHSs). Under mild assumptions, we prove that the regret of DeltaBO is of order $\tilde{\mathcal{O}}(\sqrt{T (T/N + \gamma_\delta)})$, where $N$ denotes the number of evaluations from source tasks and typically $N \gg T$. In many applications, source and target tasks are similar, so that $\gamma_\delta$ can be much smaller than $\gamma_f$. Empirical studies on both real-world hyperparameter-tuning tasks and synthetic functions show that DeltaBO outperforms baseline methods and support our theoretical claims.
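To make the transfer idea concrete, here is a minimal, hypothetical sketch of the core mechanism the abstract describes: fit a surrogate to abundant source data, then place a GP only on the residual (difference) function, so the uncertainty driving acquisition reflects the simpler $\delta$ rather than the full target. All names, kernel choices, and the UCB acquisition below are illustrative assumptions, not the paper's actual DeltaBO specification.

```python
import numpy as np

# Hypothetical sketch (names and hyperparameters are my own, not from the
# paper): model the target as f_target(x) = mu_source(x) + delta(x) and
# place a GP only on the residual delta, whose posterior uncertainty is
# small when the source and target tasks are similar.

def rbf(X1, X2, ls):
    """Squared-exponential kernel with length scale ls."""
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * ls ** 2))

class GP:
    """Basic GP regression with an RBF kernel and observation jitter."""
    def __init__(self, ls=0.3, noise=1e-3):
        self.ls, self.noise = ls, noise
    def fit(self, X, y):
        self.X = X
        K = rbf(X, X, self.ls) + self.noise * np.eye(len(X))
        self.L = np.linalg.cholesky(K)
        self.alpha = np.linalg.solve(self.L.T, np.linalg.solve(self.L, y))
        return self
    def predict(self, Xs):
        Ks = rbf(Xs, self.X, self.ls)
        mu = Ks @ self.alpha
        v = np.linalg.solve(self.L, Ks.T)
        sd = np.sqrt(np.clip(1.0 - (v ** 2).sum(0), 1e-12, None))
        return mu, sd

f_source = lambda x: np.sin(3 * x)            # cheap, well-explored source
f_target = lambda x: np.sin(3 * x) + 0.3 * x  # target = source + smooth delta

# Step 1: fit a GP to many (N >> T) source evaluations -> mu_source.
Xs = np.linspace(0.0, 1.0, 50)[:, None]
gp_src = GP().fit(Xs, f_source(Xs[:, 0]))

# Step 2: BO on the target; the second GP models only the residual
# delta(x) = y_target - mu_source(x).
cand = np.linspace(0.0, 1.0, 101)[:, None]
Xt = np.array([[0.1], [0.9]])                 # two seed target evaluations
yt = f_target(Xt[:, 0])
for _ in range(10):
    mu_s_obs, _ = gp_src.predict(Xt)
    gp_delta = GP().fit(Xt, yt - mu_s_obs)    # residual (delta) model
    mu_s, _ = gp_src.predict(cand)
    mu_d, sd_d = gp_delta.predict(cand)
    # UCB acquisition: only delta's uncertainty enters the bonus, which is
    # small when delta is simple -- the intuition behind the improved regret.
    x_next = cand[np.argmax(mu_s + mu_d + 2.0 * sd_d)]
    Xt = np.vstack([Xt, [x_next]])
    yt = np.append(yt, f_target(x_next[0]))

best = yt.max()
```

Because the residual $0.3x$ is far smoother than the target itself, the delta-GP's confidence bands collapse after only a few target evaluations, and the search concentrates near the true optimum; this mirrors the abstract's point that $\gamma_\delta \ll \gamma_f$ when the tasks are similar.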