We consider the problem of optimizing combinatorial spaces (e.g., sequences, trees, and graphs) using expensive black-box function evaluations, for example, optimizing molecules for drug design using physical lab experiments. Bayesian optimization (BO) is an efficient framework for solving such problems by intelligently selecting inputs with high utility, guided by a learned surrogate model. A recent BO approach for combinatorial spaces reduces the problem to BO over continuous spaces by learning a latent representation of structures using deep generative models (DGMs). The input selected from the continuous space is decoded into a discrete structure for performing the function evaluation. However, the surrogate model over the latent space uses only the information learned by the DGM, which may not have the desired inductive bias to approximate the target black-box function. To overcome this drawback, this paper proposes a principled approach referred to as LADDER. The key idea is to define a novel structure-coupled kernel that explicitly integrates the structural information from decoded structures with the learned latent space representation for better surrogate modeling. Our experiments on real-world benchmarks show that LADDER significantly improves over standard BO in the latent space, and performs better than or comparable to state-of-the-art methods.
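To make the structure-coupled kernel idea concrete, the following is a minimal sketch, assuming (hypothetically) that the latent-space kernel is an RBF over DGM latent codes and the structure kernel is a Tanimoto similarity over binary fingerprints of the decoded structures, combined by a convex mixture; the exact kernels and coupling used by LADDER may differ.

```python
# Hypothetical sketch of a structure-coupled kernel: the specific kernels,
# fingerprints, and mixing scheme are illustrative assumptions, not the
# paper's exact construction.
import numpy as np

def rbf_kernel(Z1, Z2, lengthscale=1.0):
    """RBF kernel over latent codes Z1 (n x d) and Z2 (m x d)."""
    sq = np.sum(Z1**2, 1)[:, None] + np.sum(Z2**2, 1)[None, :] - 2.0 * Z1 @ Z2.T
    return np.exp(-0.5 * sq / lengthscale**2)

def tanimoto_kernel(F1, F2):
    """Tanimoto kernel over binary structure fingerprints F1 (n x p), F2 (m x p)."""
    inter = F1 @ F2.T
    norm1 = np.sum(F1, 1)[:, None]
    norm2 = np.sum(F2, 1)[None, :]
    return inter / (norm1 + norm2 - inter + 1e-12)

def structure_coupled_kernel(Z1, F1, Z2, F2, weight=0.5, lengthscale=1.0):
    """Convex combination of the latent-space and structure kernels:
    a simple (PSD-preserving) way to couple both sources of information."""
    return (1.0 - weight) * rbf_kernel(Z1, Z2, lengthscale) \
           + weight * tanimoto_kernel(F1, F2)
```

In this sketch, each training point contributes both its latent code (from the DGM encoder) and the fingerprint of its decoded structure, so the surrogate Gaussian process sees structural similarity in addition to latent-space proximity.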