We present a framework for model-free learning of event-triggered control strategies. Event-triggered methods aim to achieve high control performance while closing the feedback loop only when needed. This saves resources, e.g., network bandwidth when control commands are sent over a communication network, as in networked control systems. Event-triggered controllers consist of a communication policy, which determines when to communicate, and a control policy, which decides what to communicate. It is essential to optimize the two policies jointly, since optimizing each individually does not necessarily yield the overall optimum. To address this need for joint optimization, we propose a novel algorithm based on hierarchical reinforcement learning. The resulting algorithm is shown to achieve high-performance control while saving communication resources, and it scales seamlessly to nonlinear and high-dimensional systems. The method's applicability to real-world scenarios is demonstrated through experiments on a six-degrees-of-freedom real-time controlled manipulator. Further, we propose an approach to evaluating the stability of the learned neural network policies.
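To make the two-policy structure concrete, the following is a minimal sketch of an event-triggered control loop, assuming a toy unstable scalar plant and a simple state-drift threshold trigger. The class names, dynamics, gains, and trigger rule are illustrative placeholders, not the paper's implementation, in which both policies are neural networks learned jointly via hierarchical reinforcement learning.

```python
import numpy as np

class CommunicationPolicy:
    """Decides WHEN to close the feedback loop (the event trigger)."""
    def __init__(self, threshold: float = 0.05):
        self.threshold = threshold  # stand-in for a learned trigger

    def communicate(self, x: np.ndarray, x_event: np.ndarray) -> bool:
        # Trigger when the state has drifted far enough from the state
        # observed at the last communication event.
        return float(np.linalg.norm(x - x_event)) > self.threshold


class ControlPolicy:
    """Decides WHAT command to send once an event is triggered."""
    def act(self, x: np.ndarray) -> np.ndarray:
        return -0.8 * x  # stand-in for a learned controller


def simulate(steps: int = 200, dt: float = 0.05):
    comm, ctrl = CommunicationPolicy(), ControlPolicy()
    x = np.array([1.0])          # state of a toy unstable scalar plant
    x_event = np.zeros_like(x)   # state at the last communication event
    u = np.zeros_like(x)         # actuator holds the last received command
    events = 0
    for _ in range(steps):
        if comm.communicate(x, x_event):  # event: feedback loop is closed
            u, x_event = ctrl.act(x), x.copy()
            events += 1
        x = x + dt * (0.5 * x + u)        # plant dynamics: x_dot = 0.5 x + u
    return x, events / steps              # final state, communication rate


final_state, rate = simulate()
print(f"final |x| = {abs(final_state[0]):.3f}, communication rate = {rate:.2f}")
```

Running this sketch, the state is regulated toward the origin while communication events become sparse as the state settles, illustrating the trade-off between control performance and resource savings that the learned policies are meant to optimize jointly.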