We apply deep reinforcement learning (DRL) to the design of a networked controller, subject to network delays, for a temporal control task described by a signal temporal logic (STL) formula. STL is well suited to specifications with bounded time intervals for dynamical systems. In general, the agent needs not only the current system state but also the past behavior of the system to determine a control action that satisfies the given STL formula. Additionally, the effect of network delays on data transmission must be taken into account. Thus, we propose an extended Markov decision process, called a $\tau d$-MDP, whose state incorporates past system states and control actions so that the agent can evaluate the satisfaction of the STL formula while accounting for the network delays. We then apply a DRL algorithm to design a networked controller based on the $\tau d$-MDP. Through simulations, we demonstrate the learning performance of the proposed algorithm.
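To illustrate the idea of augmenting the decision process with past information, the following is a minimal sketch (not the paper's formal $\tau d$-MDP construction): an environment wrapper whose observation concatenates a window of the last `tau` system states, for evaluating a bounded-horizon STL formula, with the last `delay` control actions still in transit over the network. The names `TauDMDPWrapper`, `base_env`, `tau`, `delay`, and `action_dim` are hypothetical placeholders, not the paper's notation or API.

```python
import numpy as np
from collections import deque


class TauDMDPWrapper:
    """Illustrative state augmentation (assumed construction, not the
    paper's definition): the agent observes the last `tau` system states
    plus the last `delay` actions that have been issued but not yet
    applied to the plant because of the network delay."""

    def __init__(self, base_env, tau, delay):
        self.base_env = base_env
        self.tau = tau
        self.delay = delay
        self.state_hist = deque(maxlen=tau)   # past system states
        self.action_hist = deque()            # in-flight (delayed) actions

    def reset(self):
        s0 = np.asarray(self.base_env.reset(), dtype=np.float32)
        self.state_hist.clear()
        self.action_hist.clear()
        for _ in range(self.tau):
            self.state_hist.append(s0)
        # Assumed attribute `action_dim`; the queue starts with zero actions.
        zero_a = np.zeros(self.base_env.action_dim, dtype=np.float32)
        for _ in range(self.delay):
            self.action_hist.append(zero_a)
        return self._augmented_state()

    def step(self, action):
        # The plant receives the action issued `delay` steps earlier,
        # modelling the transmission delay; the new action joins the queue.
        applied = self.action_hist.popleft()
        self.action_hist.append(np.asarray(action, dtype=np.float32))
        next_s, reward, done, info = self.base_env.step(applied)
        self.state_hist.append(np.asarray(next_s, dtype=np.float32))
        return self._augmented_state(), reward, done, info

    def _augmented_state(self):
        # Concatenate the state window and the queued actions into one
        # vector that a standard DRL algorithm can consume directly.
        return np.concatenate(list(self.state_hist) + list(self.action_hist))
```

Under these assumptions, the augmented state is Markovian with respect to the delayed dynamics, so an off-the-shelf DRL algorithm can be trained on the wrapper without modification.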