The ability to direct a Probabilistic Boolean Network (PBN) to a desired state is important to applications such as targeted therapeutics in cancer biology. Reinforcement Learning (RL) has been proposed as a framework for solving the associated discrete-time optimal control problem, cast as a Markov Decision Process. We focus on an integrative framework powered by a model-free deep RL method that can address different flavours of the control problem (e.g., with or without control inputs; an attractor state or a subset of the state space as the target domain). The method is agnostic to the distribution of probabilities over next states, and hence does not require the probability transition matrix. The time complexity of training is linear in the number of time steps, i.e., interactions between the agent (deep RL) and the environment (PBN). Indeed, we explore the scalability of the deep RL approach to (set) stabilization of large-scale PBNs and demonstrate successful control of large networks, including a metastatic melanoma PBN with 200 nodes.
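To make the agent-environment loop described above concrete, the following is a minimal sketch on a hypothetical 3-node PBN. A tabular Q-learning agent stands in for the paper's deep RL method; the network functions (`FUNCS`), selection probabilities, target state, and hyperparameters are all illustrative assumptions, not the networks or settings studied in the paper. The point it demonstrates is the model-free property: the agent learns only from sampled transitions and never touches the transition matrix, and its cost is linear in the number of interactions.

```python
import random
from collections import defaultdict

# Hypothetical 3-node PBN (illustrative only): each node updates via one of
# several Boolean functions, drawn independently with fixed probabilities.
FUNCS = {
    0: [(0.6, lambda s: s[1] and s[2]), (0.4, lambda s: not s[2])],
    1: [(1.0, lambda s: s[0] or s[2])],
    2: [(0.7, lambda s: not s[0]), (0.3, lambda s: s[1])],
}
TARGET = (1, 1, 0)           # assumed desired target state
ACTIONS = [None, 0, 1, 2]    # no-op, or flip one node as a control input

def pbn_step(state, action):
    """Apply the control input, then one stochastic PBN transition.
    The agent never observes the probabilities used here."""
    s = list(state)
    if action is not None:
        s[action] ^= 1                       # intervention: flip one node
    nxt = []
    for i in range(3):
        r, acc = random.random(), 0.0
        choice = FUNCS[i][-1][1]             # fallback guards float round-off
        for p, f in FUNCS[i]:
            acc += p
            if r < acc:
                choice = f
                break
        nxt.append(int(choice(s)))
    return tuple(nxt)

# Model-free Q-learning (a tabular stand-in for the deep RL agent):
# updates use only sampled (state, action, reward, next-state) tuples.
Q = defaultdict(float)
alpha, gamma, eps = 0.1, 0.95, 0.1

state = (0, 0, 1)
for t in range(50_000):                      # cost linear in interactions
    if random.random() < eps:                # epsilon-greedy exploration
        a = random.choice(ACTIONS)
    else:
        a = max(ACTIONS, key=lambda x: Q[(state, x)])
    nxt = pbn_step(state, a)
    reward = 1.0 if nxt == TARGET else 0.0
    best_next = max(Q[(nxt, x)] for x in ACTIONS)
    Q[(state, a)] += alpha * (reward + gamma * best_next - Q[(state, a)])
    # reset to a random state once the target is reached
    state = nxt if nxt != TARGET else tuple(random.getrandbits(1) for _ in range(3))

print("greedy action from (0, 0, 1):",
      max(ACTIONS, key=lambda x: Q[((0, 0, 1), x)]))
```

Scaling this sketch to networks of hundreds of nodes is exactly where a tabular table breaks down (the state space has size $2^n$), which motivates replacing `Q` with a neural network approximator as in the deep RL approach the abstract describes.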