We developed the Distilled Graph Attention Policy Network (DGAPN), a reinforcement learning model that generates novel graph-structured chemical representations optimizing user-defined objectives by efficiently navigating a physically constrained domain. The framework is evaluated on the task of generating molecules designed to bind noncovalently to functional sites of SARS-CoV-2 proteins. We present a spatial Graph Attention (sGAT) mechanism that applies self-attention over both node and edge attributes and also encodes spatial structure, a capability of considerable interest in synthetic biology and drug discovery. An attentional policy network is introduced to learn the decision rules for a dynamic, fragment-based chemical environment, and state-of-the-art policy gradient techniques are employed to train the network stably. Exploration is driven by the stochasticity of the action space design and by the innovation reward bonuses learned and proposed by random network distillation. In experiments, our framework achieved outstanding results compared to state-of-the-art algorithms, while reducing the complexity of paths to chemical synthesis.
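To make the exploration idea in the abstract concrete, the following is a minimal sketch of a random network distillation bonus. It is not the DGAPN implementation. It assumes a toy fixed-length vector state in place of a molecular graph, a one-layer random target network, and a linear predictor; all function names, sizes, and the learning rate are hypothetical choices for illustration. The bonus for a state is the predictor's squared error against the fixed target, so it shrinks as a state is visited repeatedly.

```python
import math
import random


def make_linear(n_in, n_out, rng):
    """Return a random weight matrix with n_out rows and n_in columns."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]


def target_features(weights, state):
    """Fixed, never-trained target network: tanh of a random linear map."""
    return [math.tanh(sum(w * s for w, s in zip(row, state))) for row in weights]


def predict(weights, state):
    """Trainable predictor network: a plain linear map (toy assumption)."""
    return [sum(w * s for w, s in zip(row, state)) for row in weights]


def novelty_bonus(target_w, predictor_w, state):
    """Exploration bonus: squared error between predictor and target outputs."""
    t = target_features(target_w, state)
    p = predict(predictor_w, state)
    return sum((pi - ti) ** 2 for pi, ti in zip(p, t))


def update_predictor(target_w, predictor_w, state, lr=0.05):
    """One gradient step pulling the predictor toward the target on this state."""
    t = target_features(target_w, state)
    p = predict(predictor_w, state)
    for row, pi, ti in zip(predictor_w, p, t):
        grad_scale = 2.0 * (pi - ti)  # derivative of the squared error w.r.t. pi
        for j, s in enumerate(state):
            row[j] -= lr * grad_scale * s


rng = random.Random(0)
target_w = make_linear(4, 3, rng)     # frozen after initialization
predictor_w = make_linear(4, 3, rng)  # the only weights that are trained

visited = [1.0, 0.0, 0.5, -0.5]       # hypothetical state embedding
bonus_before = novelty_bonus(target_w, predictor_w, visited)
for _ in range(200):                  # simulate repeated visits to the same state
    update_predictor(target_w, predictor_w, visited)
bonus_after = novelty_bonus(target_w, predictor_w, visited)

print(bonus_before, bonus_after)
```

In this sketch the bonus for the repeatedly visited state decays toward zero, while states the predictor has not been trained on generally keep a larger error. That gap is what steers an agent toward unexplored regions when the bonus is added to the task reward.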