Reinforcement learning (RL) that generates molecules with desired properties has recently been highlighted as a promising strategy for drug design. A molecular docking program - a physical simulation that estimates protein-small molecule binding affinity - can be an ideal reward scoring function for RL, as it is a straightforward proxy for therapeutic potential. Still, two pressing challenges exist for this task. First, the models often fail to generate chemically realistic and pharmacochemically acceptable molecules. Second, docking score optimization is a difficult exploration problem: the score landscape has many local optima and is not smooth with respect to molecular structure. To tackle these challenges, we propose a novel RL framework that generates pharmacochemically acceptable molecules with large docking scores. Our method - Fragment-based generative RL with Explorative Experience replay for Drug design (FREED) - constrains the generated molecules to a realistic and qualified chemical space, and explores that space effectively to find drugs by coupling our fragment-based generation method with a novel error-prioritized experience replay (PER). We also show that our model performs well on both de novo and scaffold-based schemes. Our model produces molecules of higher quality than existing methods while achieving state-of-the-art performance on two of three targets in terms of the docking scores of the generated molecules. We further show with ablation studies that our method, predictive error-PER (FREED(PE)), significantly improves the model performance.
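As a rough illustration of the error-prioritized experience replay idea, the following minimal sketch stores each generated molecule with a priority equal to the absolute gap between a predicted reward and the reward actually obtained from docking, and samples in proportion to that priority. This is one plausible reading of "predictive error" prioritization, not the implementation used in FREED; the class name `PredictiveErrorReplay`, its parameters, and the first-in-first-out eviction rule are all assumptions made for the example.

```python
import random


class PredictiveErrorReplay:
    """Hypothetical replay buffer that favors poorly predicted experiences."""

    def __init__(self, capacity, eps=1e-3, seed=None):
        self.capacity = capacity      # maximum number of stored experiences
        self.eps = eps                # small constant so priorities stay positive
        self.items = []               # stored (molecule, reward) pairs
        self.priorities = []          # one sampling weight per stored pair
        self.rng = random.Random(seed)

    def add(self, molecule, predicted_reward, reward):
        # Assumption: priority is the absolute predictive error of some
        # auxiliary reward predictor, plus eps.
        priority = abs(predicted_reward - reward) + self.eps
        if len(self.items) >= self.capacity:
            # Assumption: evict the oldest experience when the buffer is full.
            self.items.pop(0)
            self.priorities.pop(0)
        self.items.append((molecule, reward))
        self.priorities.append(priority)

    def sample(self, k):
        # Draw k experiences with replacement, proportionally to priority.
        return self.rng.choices(self.items, weights=self.priorities, k=k)


# Usage with placeholder molecule labels and made-up reward values.
buffer = PredictiveErrorReplay(capacity=100, seed=0)
buffer.add("mol_a", predicted_reward=7.9, reward=8.0)   # well predicted
buffer.add("mol_b", predicted_reward=4.0, reward=9.5)   # badly predicted
batch = buffer.sample(4)
```

Under this scheme, molecules whose reward the predictor gets most wrong are replayed more often, which is one way to push the agent toward less explored regions of chemical space.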