Retrosynthetic planning, which aims to find a reaction pathway to synthesize a target molecule, plays an important role in chemistry and drug discovery. The task is usually modeled as a search problem. Recently, data-driven methods have attracted considerable research interest and shown promising results for retrosynthetic planning. We observe that the same intermediate molecules are visited many times during the search, and that previous tree-based methods (e.g., AND-OR tree search, Monte Carlo tree search) usually treat each visit independently. Such redundancy makes the search inefficient. We propose a graph-based search policy that eliminates redundant exploration of intermediate molecules. Because searching over a graph is more complicated than searching over a tree, we further adopt a graph neural network to guide the search. In addition, our method can search a batch of targets together in a single graph, removing the inter-target duplication that arises in tree-based search methods. Experimental results on two datasets demonstrate the effectiveness of our method. In particular, on the widely used USPTO benchmark, we improve the search success rate to 99.47%, surpassing the previous state of the art by 2.6 points.
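To make the redundancy argument concrete, the following is a minimal sketch, not the method of the paper. It assumes a hypothetical toy reaction table (`TOY_REACTIONS`) standing in for a learned one-step retrosynthesis model, and a hypothetical set of purchasable `BUILDING_BLOCKS`. It only counts how many times the one-step model would be called when each occurrence of a molecule is a separate tree node, versus when each distinct molecule is a single shared graph node. The graph neural network guidance and the AND-OR structure are deliberately omitted.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy stand-in for a one-step retrosynthesis model:
# product -> list of candidate reactant sets. Not real chemistry.
TOY_REACTIONS: Dict[str, List[Tuple[str, ...]]] = {
    "T1": [("A", "B")],
    "T2": [("A", "C")],
    "A": [("D", "E")],
    "B": [("A", "D")],
    "C": [("E",)],
}
# Hypothetical purchasable molecules; these are never expanded.
BUILDING_BLOCKS: Set[str] = {"D", "E"}


def tree_expansions(mol: str) -> int:
    """Count model calls when every occurrence of a molecule is its own tree node."""
    if mol in BUILDING_BLOCKS or mol not in TOY_REACTIONS:
        return 0
    calls = 1  # one call to expand this occurrence
    for reactants in TOY_REACTIONS[mol]:
        for reactant in reactants:
            calls += tree_expansions(reactant)
    return calls


def graph_expansions(targets: List[str]) -> int:
    """Count model calls when each distinct molecule is one shared graph node."""
    expanded: Set[str] = set()
    stack = list(targets)
    while stack:
        mol = stack.pop()
        if mol in BUILDING_BLOCKS or mol in expanded or mol not in TOY_REACTIONS:
            continue  # purchasable, already expanded, or no known reaction
        expanded.add(mol)
        for reactants in TOY_REACTIONS[mol]:
            stack.extend(reactants)
    return len(expanded)


if __name__ == "__main__":
    tree_calls = tree_expansions("T1") + tree_expansions("T2")
    graph_calls = graph_expansions(["T1", "T2"])
    print(f"tree-style expansions:  {tree_calls}")
    print(f"graph-style expansions: {graph_calls}")
```

In this toy setting, intermediate `A` is reached twice while planning `T1` and once more while planning `T2`, so the tree-style count is 7 expansions while the shared graph needs 5. The gap illustrates both the intra-target and the inter-target duplication described above. A real search would also need a policy for choosing which node to expand next, which is where the learned guidance comes in.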