We developed Distilled Graph Attention Policy Networks (DGAPNs), a curiosity-driven reinforcement learning model that generates novel graph-structured chemical representations optimizing user-defined objectives by efficiently navigating a physically constrained domain. The framework is evaluated on the task of generating molecules designed to bind noncovalently to functional sites of SARS-CoV-2 proteins. We present a spatial Graph Attention Network (sGAT) that applies self-attention over both node and edge attributes and also encodes spatial structure, a capability of considerable interest in areas such as molecular and synthetic biology and drug discovery. An attentional policy network is then introduced to learn decision rules for a dynamic, fragment-based chemical environment, and state-of-the-art policy gradient techniques are employed to train the network with enhanced stability. Exploration is encouraged efficiently by incorporating innovation reward bonuses learned and proposed by random network distillation. In experiments, our framework achieved outstanding results compared to state-of-the-art algorithms, while increasing the diversity of proposed molecules and reducing the complexity of paths to chemical synthesis.
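The exploration bonus mentioned above can be illustrated with a toy version of random network distillation: a predictor is trained to imitate a fixed, randomly initialised target network, and its prediction error on a state serves as the novelty reward. The sketch below is a minimal illustration under stated assumptions, not the authors' implementation. The class name `RNDBonus`, the parameters `embed_dim` and `lr`, and the use of plain vectors instead of molecular graphs are all hypothetical choices made for brevity.

```python
import numpy as np


class RNDBonus:
    """Toy random network distillation (illustrative names, not from the paper)."""

    def __init__(self, state_dim, embed_dim=8, lr=0.05, seed=0):
        rng = np.random.default_rng(seed)
        # Fixed, randomly initialised target network; it is never trained.
        self.target_w = rng.normal(size=(state_dim, embed_dim))
        # Linear predictor, trained to imitate the target on visited states.
        self.pred_w = np.zeros((state_dim, embed_dim))
        self.lr = lr

    def _error(self, state):
        # Difference between predictor output and target output.
        return state @ self.pred_w - np.tanh(state @ self.target_w)

    def bonus(self, state):
        # Intrinsic reward: mean squared prediction error on this state.
        return float(np.mean(self._error(state) ** 2))

    def update(self, state):
        # One gradient-descent step on the mean squared error w.r.t. pred_w.
        err = self._error(state)
        grad = 2.0 * np.outer(state, err) / err.size
        self.pred_w -= self.lr * grad


# Assumed usage: a state visited many times earns a shrinking bonus,
# while an unvisited state keeps its larger one.
rnd = RNDBonus(state_dim=4)
visited = np.array([1.0, 0.0, -1.0, 0.5])
unvisited = np.array([0.0, 1.0, 0.0, 0.0])
for _ in range(200):
    rnd.update(visited)
print(rnd.bonus(visited), rnd.bonus(unvisited))
```

In a full system the state would be a learned graph embedding rather than a raw vector, and the bonus would be added to the task reward before the policy gradient update; both details are outside what this sketch covers.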