Bayesian optimization (BayesOpt) is a gold standard for query-efficient continuous optimization. However, its adoption for drug design has been hindered by the discrete, high-dimensional nature of the decision variables. We develop a new approach (LaMBO) which jointly trains a denoising autoencoder with a discriminative multi-task Gaussian process head, allowing gradient-based optimization of multi-objective acquisition functions in the latent space of the autoencoder. These acquisition functions allow LaMBO to balance the explore-exploit tradeoff over multiple design rounds, and to balance objective tradeoffs by optimizing sequences at many different points on the Pareto frontier. We evaluate LaMBO on two small-molecule design tasks, and introduce new tasks optimizing \emph{in silico} and \emph{in vitro} properties of large-molecule fluorescent proteins. In our experiments LaMBO outperforms genetic optimizers and does not require a large pretraining corpus, demonstrating that BayesOpt is practical and effective for biological sequence design.
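To make the notion of a Pareto frontier concrete, the following is a minimal sketch of filtering a set of candidates down to its non-dominated subset. It is not LaMBO's implementation: the function names `dominates` and `pareto_front` and the objective values are hypothetical, and it assumes every objective is to be maximized.

```python
def dominates(a, b):
    """True if point a is at least as good as b on every objective
    and strictly better on at least one (all objectives maximized)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(points):
    """Return the points that no other point dominates, in input order."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical two-objective scores for five candidate sequences.
scores = [(1.0, 5.0), (2.0, 4.0), (1.5, 3.0), (3.0, 1.0), (0.5, 0.5)]
front = pareto_front(scores)
```

Here `(1.5, 3.0)` and `(0.5, 0.5)` are dropped because `(2.0, 4.0)` is better on both objectives; the three remaining points represent different tradeoffs between the two objectives. The pairwise check is quadratic in the number of candidates, which is adequate for illustration.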