This paper presents LEMURS, an algorithm for learning scalable multi-robot control policies from cooperative task demonstrations. We propose a port-Hamiltonian description of the multi-robot system to exploit universal physical constraints in interconnected systems and achieve closed-loop stability. We represent a multi-robot control policy using an architecture that combines self-attention mechanisms and neural ordinary differential equations. The former handles time-varying communication in the robot team, while the latter respects the continuous-time robot dynamics. Our representation is distributed by construction, enabling the learned control policies to be deployed in robot teams of different sizes. We demonstrate that LEMURS can learn interactions and cooperative behaviors from demonstrations of multi-agent navigation and flocking tasks.
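For readers unfamiliar with the port-Hamiltonian formalism, the following is a sketch of the standard input-state-output form and its energy balance. It is the generic textbook formulation, not necessarily the exact parameterization used in LEMURS. The symbols are assumptions introduced here for illustration: state $x$, Hamiltonian (energy) $H$, interconnection matrix $J$, dissipation matrix $R$, input matrix $F$, control input $u$, and output $y$.

```latex
% Generic input-state-output port-Hamiltonian system (illustrative notation).
\begin{align}
  \dot{x} &= \bigl(J(x) - R(x)\bigr)\,\frac{\partial H}{\partial x}(x) + F(x)\,u, \\
  y       &= F(x)^{\top}\,\frac{\partial H}{\partial x}(x),
\end{align}
% Structural constraints: J is skew-symmetric, R is symmetric positive semidefinite.
\begin{equation}
  J(x) = -J(x)^{\top}, \qquad R(x) = R(x)^{\top} \succeq 0 .
\end{equation}
% Energy balance: the skew-symmetric term contributes nothing, so
\begin{equation}
  \dot{H}
  = \frac{\partial H}{\partial x}^{\top}\dot{x}
  = -\,\frac{\partial H}{\partial x}^{\top} R(x)\,\frac{\partial H}{\partial x} + y^{\top}u
  \;\le\; y^{\top}u .
\end{equation}
```

The inequality states that stored energy can grow no faster than the power supplied through the input port (passivity). This is the kind of structural property that a port-Hamiltonian description makes available when arguing closed-loop stability.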