Protein structure prediction is a fundamental problem in computational molecular biology. Classical algorithms such as ab initio modeling and threading, as well as many learning-based methods, have been proposed to solve this challenging problem. However, most reinforcement learning methods model state-action pairs as discrete objects. In this paper, we develop a reinforcement learning (RL) framework in a continuous setting, based on a stochastic parametrized Hamiltonian version of the Pontryagin maximum principle (PMP), to solve the side-chain packing and protein-folding problems. In special cases, our formulation reduces to previous work in which the optimal folding trajectories are trained through explicit use of Langevin dynamics. Optimal continuous stochastic Hamiltonian folding pathways can be derived using different models of molecular energetics and force fields. Our RL implementation adopts a soft actor-critic methodology, although this training scheme can be replaced by other RL methods such as A2C, A3C, or PPO.
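As context for the Langevin-dynamics baseline mentioned above, the following is a minimal sketch of an overdamped Langevin update on a toy energy landscape. The one-dimensional double-well energy here is a hypothetical stand-in for a molecular force field; the step size, inverse temperature, and step count are illustrative choices, not values from the paper.

```python
import math
import random

def energy_grad(x):
    # Gradient of a toy double-well energy E(x) = (x^2 - 1)^2,
    # a hypothetical stand-in for a molecular force field.
    return 4.0 * x * (x * x - 1.0)

def langevin_trajectory(x0, steps=5000, eta=1e-3, beta=10.0, seed=0):
    """Overdamped Langevin dynamics:
    x <- x - eta * dE/dx + sqrt(2 * eta / beta) * xi,  xi ~ N(0, 1).
    """
    rng = random.Random(seed)
    x = x0
    traj = [x]
    for _ in range(steps):
        noise = rng.gauss(0.0, 1.0)
        x = x - eta * energy_grad(x) + math.sqrt(2.0 * eta / beta) * noise
        traj.append(x)
    return traj

# Starting between the wells, the chain drifts toward a minimum at x = +/- 1,
# with thermal noise allowing occasional barrier crossings.
traj = langevin_trajectory(x0=0.2)
```

In the folding setting, `x` would be a high-dimensional conformation vector and `energy_grad` the force from the chosen molecular force field; the PMP-based framework described in the abstract replaces this fixed update rule with a learned control.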