A fundamental problem in computational chemistry is finding a set of reactants that synthesize a target molecule, a task known as retrosynthesis prediction. Existing state-of-the-art methods match the target molecule against a large set of reaction templates, which is computationally expensive and limited in coverage. In this paper, we propose G2Gs, a novel template-free approach that transforms a target molecular graph into a set of reactant molecular graphs. G2Gs first splits the target molecular graph into a set of synthons by identifying the reaction centers, and then translates the synthons into the final reactant graphs via a variational graph translation framework. Experimental results show that G2Gs significantly outperforms existing template-free approaches, by up to 63% in terms of top-1 accuracy, and achieves performance close to that of state-of-the-art template-based approaches, while requiring no domain knowledge and being much more scalable.
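The first stage described in the abstract, splitting the target graph into synthons at the reaction centers, can be illustrated with a minimal sketch. This is not the G2Gs model: the reaction center is supplied by hand rather than predicted, bond orders are ignored, and the function name `split_into_synthons` and the toy ester graph are assumptions made purely for illustration. The second stage (variational graph translation from synthons to reactants) is not shown.

```python
from collections import defaultdict


def split_into_synthons(num_atoms, bonds, reaction_center):
    """Remove the reaction-center bonds and return the connected components.

    Hypothetical helper: in G2Gs the reaction center is predicted by a
    model; here it is simply passed in as a list of atom-index pairs.
    """
    broken = {frozenset(bond) for bond in reaction_center}

    # Build an adjacency map that skips the bonds being broken.
    adjacency = defaultdict(set)
    for a, b in bonds:
        if frozenset((a, b)) in broken:
            continue
        adjacency[a].add(b)
        adjacency[b].add(a)

    # Depth-first search to collect each connected component (one synthon).
    seen = set()
    synthons = []
    for start in range(num_atoms):
        if start in seen:
            continue
        component = set()
        stack = [start]
        while stack:
            atom = stack.pop()
            if atom in component:
                continue
            component.add(atom)
            stack.extend(adjacency[atom] - component)
        seen |= component
        synthons.append(sorted(component))
    return synthons


# Toy target: the heavy-atom skeleton of an ester (assumed example).
atoms = ["C", "C", "O", "O", "C", "C"]
bonds = [(0, 1), (1, 2), (1, 3), (3, 4), (4, 5)]

# Assume the reaction center is the bond between atoms 1 and 3.
synthons = split_into_synthons(len(atoms), bonds, reaction_center=[(1, 3)])
print(synthons)
```

Each returned list holds the atom indices of one synthon. In the approach the abstract describes, these fragments would then be passed to the translation stage, which completes them into full reactant graphs.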