AI-assisted molecular optimization is a very active research field, as it is expected to deliver the next generation of drugs and molecular materials. A major difficulty is that the properties to be optimized rely on costly evaluations. Machine learning methods have been used successfully to predict these properties, but they generalize poorly to less explored regions of chemical space. We propose a surrogate-based black-box optimization method that tackles the optimization and machine learning problems jointly. It consists of optimizing, with an evolutionary algorithm, the expected improvement of a surrogate of the molecular property. The surrogate is a Gaussian Process Regression (GPR) model, trained on a region of the search space that is relevant to the property being optimized. We show that our approach can successfully optimize a costly property of interest much faster than a purely metaheuristic approach.
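To illustrate the acquisition criterion named in the abstract, here is a minimal sketch of expected improvement for a maximization problem. It assumes the surrogate returns a Gaussian predictive mean and standard deviation for each candidate molecule. The function names, the `xi` exploration margin, and the dictionary of predictions are hypothetical choices for illustration, not taken from the paper. An exhaustive `max` over a small candidate set stands in for the evolutionary search.

```python
import math


def normal_pdf(z: float) -> float:
    """Density of the standard normal distribution."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z: float) -> float:
    """Cumulative distribution of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_improvement(mean: float, std: float, best_observed: float,
                         xi: float = 0.0) -> float:
    """Expected improvement over best_observed for a maximization problem.

    mean and std are the surrogate's predictive mean and standard deviation
    for one candidate; xi is an optional exploration margin (an assumption
    of this sketch, not a parameter described in the abstract).
    """
    improvement = mean - best_observed - xi
    if std <= 0.0:
        # No predictive uncertainty: the improvement is deterministic.
        return max(improvement, 0.0)
    z = improvement / std
    return improvement * normal_cdf(z) + std * normal_pdf(z)


def select_candidate(predictions: dict, best_observed: float) -> str:
    """Return the candidate id with the highest expected improvement.

    predictions maps a candidate id to a (mean, std) pair. An evolutionary
    algorithm would explore the search space instead of enumerating it.
    """
    return max(
        predictions,
        key=lambda cid: expected_improvement(*predictions[cid], best_observed),
    )


if __name__ == "__main__":
    # Hypothetical surrogate predictions for two candidate molecules.
    predictions = {"mol_a": (0.9, 0.01), "mol_b": (0.8, 0.5)}
    print(select_candidate(predictions, best_observed=1.0))
```

In this toy setting the criterion prefers the uncertain candidate over one with a slightly higher predicted mean but almost no uncertainty, which is the exploration behavior that makes expected improvement useful when each true evaluation is costly.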