We present a novel deep reinforcement learning (DRL)-based design of a networked controller that satisfies signal temporal logic (STL) specifications in the presence of network delays. We consider the case in which both the system dynamics and the network delays are unknown. Because the satisfaction of an STL formula depends not only on the current state but also on the behavior of the system over time, we propose an extension of the Markov decision process (MDP), called a $\tau\delta$-MDP, with which the satisfaction of the STL formula can be evaluated under network delays. We then construct deep neural networks based on the $\tau\delta$-MDP and propose a learning algorithm. Simulations demonstrate the learning performance of the proposed algorithm.
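As a rough illustration of why an extended state is needed, the sketch below keeps a sliding window of recent states and recently issued actions and evaluates a simple "always above a threshold" property over that window. It is not the paper's $\tau\delta$-MDP construction: the class `AugmentedState`, the function `always_above`, and the reading of `tau` as a state-history length and `delta` as an action-history length are all assumptions made for this example.

```python
from collections import deque


class AugmentedState:
    """Hypothetical extended state: the last `tau` states plus the last `delta` actions."""

    def __init__(self, tau, delta, init_state, default_action):
        # Pre-fill both windows so the extended state has a fixed length from step 0.
        self.states = deque([init_state] * tau, maxlen=tau)
        self.actions = deque([default_action] * delta, maxlen=delta)

    def push(self, state, action):
        # Appending to a full deque drops the oldest entry automatically.
        self.states.append(state)
        self.actions.append(action)

    def as_tuple(self):
        # Flat view that could be fed to a function approximator (assumed usage).
        return tuple(self.states) + tuple(self.actions)


def always_above(states, threshold):
    """Robustness of 'always x > threshold' over the window: min of (x - threshold)."""
    return min(x - threshold for x in states)


aug = AugmentedState(tau=3, delta=2, init_state=0.0, default_action=0.0)
aug.push(1.0, 0.5)
aug.push(2.0, -0.5)
print(aug.as_tuple())
print(always_above(aug.states, 0.5))
```

The robustness value is negative here because the oldest state in the window violates the threshold, which the current state alone would not reveal.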