In this paper, we explore a multi-agent reinforcement learning approach to the joint design of communication and control strategies for multi-agent cooperative transport. Typical end-to-end deep neural network policies are insufficient for this problem: they cannot decide when to communicate and work only with fixed-rate communication. Our framework therefore exploits an event-triggered architecture, namely a feedback controller that computes the communication input and a triggering mechanism that determines when that input must be updated. Such event-triggered control policies are efficiently optimized with a multi-agent deep deterministic policy gradient. Numerical simulations confirm that our approach balances transport performance against communication savings.
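The event-triggered idea described above can be illustrated with a minimal sketch: a feedback input is held constant between events, and a triggering mechanism transmits an update only when the state has drifted sufficiently far from the last-communicated state. The threshold, the simple proportional controller, and the toy dynamics below are illustrative assumptions, not the paper's learned policy.

```python
import numpy as np

def event_triggered_rollout(x0, horizon=50, threshold=0.5):
    """Sketch of an event-triggered communication loop (illustrative only).

    The input is recomputed and communicated only when the current state
    deviates from the last-communicated state by more than `threshold`.
    """
    rng = np.random.default_rng(0)
    x = np.array(x0, dtype=float)
    x_last = x.copy()          # state at the last communication event
    u = -0.5 * x               # feedback input held between events
    events = 0
    for _ in range(horizon):
        # triggering mechanism: communicate only on large deviation
        if np.linalg.norm(x - x_last) > threshold:
            u = -0.5 * x       # feedback controller recomputes the input
            x_last = x.copy()
            events += 1
        # toy linear dynamics with small process noise
        x = x + 0.1 * u + 0.01 * rng.standard_normal(x.shape)
    return x, events
```

In contrast to fixed-rate communication, which would transmit at every one of the `horizon` steps, the event count here grows only with how quickly the state drifts, which is the source of the communication savings the abstract refers to.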