An efficient graph generative model for navigating ultra-large combinatorial synthesis libraries

Virtual, make-on-demand chemical libraries have transformed early-stage drug discovery by unlocking vast, synthetically accessible regions of chemical space. Recent years have witnessed rapid growth in these libraries from millions to trillions of compounds, hiding undiscovered, potent hits for a variety of therapeutic targets. However, they are quickly approaching a size beyond that which permits explicit enumeration, presenting new challenges for virtual screening. To overcome these challenges, we propose the Combinatorial Synthesis Library Variational Auto-Encoder (CSLVAE). The proposed generative model represents such libraries as a differentiable, hierarchically-organized database. Given a compound from the library, the molecular encoder constructs a query for retrieval, which is utilized by the molecular decoder to reconstruct the compound by first decoding its chemical reaction and subsequently decoding its reactants. Our design minimizes autoregression in the decoder, facilitating the generation of large, valid molecular graphs. Our method performs fast and parallel batch inference for ultra-large synthesis libraries, enabling a number of important applications in early-stage drug discovery. Compounds proposed by our method are guaranteed to be in the library, and thus synthetically and cost-effectively accessible. Importantly, CSLVAE can encode out-of-library compounds and search for in-library analogues. In experiments, we demonstrate the capabilities of the proposed method in the navigation of massive combinatorial synthesis libraries.
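The reaction-first, reactants-second decoding described above can be illustrated with a toy sketch. This is not the CSLVAE implementation: the library contents, the random embeddings, the use of a single query vector for every decoding step, and the names `LIBRARY`, `library_size`, `make_keys`, and `decode` are all assumptions made for illustration. The sketch shows two points from the abstract: the library size is a sum of products over reactant pools, so it can be counted without enumerating products, and a decoder that only retrieves from the library can only propose in-library compounds.

```python
import math
import random

# Hypothetical toy library: each reaction maps to a list of reactant slots,
# and each slot holds a pool of candidate building blocks (names are made up).
LIBRARY = {
    "amide_coupling": [["acid_1", "acid_2", "acid_3"], ["amine_1", "amine_2"]],
    "suzuki": [["boronate_1", "boronate_2"],
               ["halide_1", "halide_2", "halide_3", "halide_4"]],
}


def library_size(library):
    """Count products without enumerating them: sum over reactions of the
    product of the reactant pool sizes."""
    return sum(math.prod(len(pool) for pool in slots)
               for slots in library.values())


def make_keys(library, dim=8, seed=0):
    """Assign a random key vector to every reaction and reactant.
    A trained model would learn these; random values are a stand-in."""
    rng = random.Random(seed)
    keys = {}
    for reaction, slots in library.items():
        keys[reaction] = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        for pool in slots:
            for reactant in pool:
                keys[reactant] = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    return keys


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def decode(query, library, keys):
    """Two-stage retrieval: pick the best-scoring reaction first, then the
    best-scoring reactant for each slot of that reaction.
    Simplification: the same query is reused at every step."""
    reaction = max(library, key=lambda r: dot(query, keys[r]))
    reactants = [max(pool, key=lambda r: dot(query, keys[r]))
                 for pool in library[reaction]]
    return reaction, reactants


keys = make_keys(LIBRARY)
query = [random.Random(1).gauss(0.0, 1.0) for _ in range(8)]
reaction, reactants = decode(query, LIBRARY, keys)
print(library_size(LIBRARY))  # 3*2 + 2*4 = 14
print(reaction, reactants)
```

Because `decode` only ever selects among keys that exist in `LIBRARY`, any query vector, including one derived from an out-of-library compound, maps to a compound that is in the library. The cost of one decode scales with the number of reactions plus the pool sizes of the chosen reaction, not with the number of products.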
