In this paper, we address the design of control strategies for multi-agent cooperative transport. Existing learning-based methods assume that the number of agents at deployment equals that in the training environment; in practice, however, the number may change because robots may drop out when their batteries are fully discharged, or additional robots may be introduced to shorten the time required to complete a task. It is therefore crucial that the learned strategy remain applicable when the number of agents differs from that in the training environment. We propose a novel multi-agent reinforcement learning framework that combines event-triggered communication with consensus-based control for distributed cooperative transport. The proposed policy model estimates the resultant force and torque in a consensus manner, using the corresponding estimates exchanged with neighboring agents. It also computes the control inputs and the communication inputs that determine when to communicate with neighboring agents, based on local observations and the estimates of the resultant force and torque. The proposed framework can thus balance control performance against communication savings in scenarios where the number of agents differs from that in the training environment. We confirm the effectiveness of our approach in simulations with up to eight robots and in experiments with up to six robots.
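The two mechanisms named in the abstract, consensus-based estimation and event-triggered communication, can be illustrated with a minimal sketch. This is not the paper's learned policy: the threshold trigger, the consensus gain, the all-to-all neighborhood, and all names below are illustrative assumptions. Each agent keeps an estimate of the resultant force and torque, re-broadcasts it only when it has drifted past a threshold since the last broadcast, and nudges it toward the mean of its neighbors' last broadcasts. Nothing in the loop depends on the number of agents, which is the property the framework targets.

```python
import numpy as np

class Agent:
    """Illustrative agent holding a [force, torque] estimate (assumed names)."""

    def __init__(self, local_measurement, threshold=0.1, gain=0.5):
        self.estimate = np.array(local_measurement, dtype=float)
        self.last_broadcast = self.estimate.copy()
        self.threshold = threshold  # event-trigger threshold (assumption)
        self.gain = gain            # consensus step size (assumption)

    def maybe_broadcast(self):
        """Event trigger: communicate only if the estimate moved enough."""
        if np.linalg.norm(self.estimate - self.last_broadcast) > self.threshold:
            self.last_broadcast = self.estimate.copy()
            return True
        return False

    def consensus_step(self, neighbor_broadcasts):
        """Move the estimate toward the mean of neighbors' last broadcasts."""
        if neighbor_broadcasts:
            mean = np.mean(neighbor_broadcasts, axis=0)
            self.estimate += self.gain * (mean - self.estimate)

def run(agents, steps=50):
    """Run the loop and count how many messages were actually sent."""
    messages = 0
    for _ in range(steps):
        for a in agents:
            if a.maybe_broadcast():
                messages += 1
        for i, a in enumerate(agents):
            others = [b.last_broadcast for j, b in enumerate(agents) if j != i]
            a.consensus_step(others)
    return messages

# The loop works for any number of agents, e.g. four here.
rng = np.random.default_rng(0)
agents = [Agent(rng.normal(size=2)) for _ in range(4)]
msgs = run(agents)
spread = max(np.linalg.norm(a.estimate - agents[0].estimate) for a in agents)
```

After the run, `msgs` is far below the worst case of one message per agent per step (communication savings), while `spread` shows the estimates have reached approximate agreement (consensus).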