Multi-robot navigation is a challenging task in which multiple robots must be coordinated simultaneously within dynamic environments. We apply deep reinforcement learning (DRL) to learn a decentralized end-to-end policy that maps raw sensor data to the agent's velocity commands. To help the policy generalize, training is performed across different environments and scenarios. The learned policy is tested and evaluated in common multi-robot scenarios such as swapping places, crossing an intersection, and passing through a bottleneck. The policy allows the agent to recover from dead ends and to navigate through complex environments.
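To make the idea of a decentralized end-to-end policy more concrete, the following is a minimal sketch of how such a policy might be executed, not the architecture or training setup used in this work. Everything specific here is an assumption made for illustration: the observation layout (range readings plus a relative goal), the constants `N_BEAMS`, `MAX_RANGE`, `MAX_LINEAR` and `MAX_ANGULAR`, the hypothetical `TinyPolicy` class with random, untrained weights, and the choice to share one set of weights across all robots. The point it illustrates is only that each robot computes its own velocity command from its own sensor data, without a central coordinator.

```python
import math
import random

# All constants below are illustrative assumptions, not values from the paper.
N_BEAMS = 8          # assumed number of range readings per scan
MAX_RANGE = 5.0      # assumed sensor range in metres
MAX_LINEAR = 0.5     # assumed linear velocity limit in m/s
MAX_ANGULAR = 1.0    # assumed angular velocity limit in rad/s


def build_observation(ranges, goal_distance, goal_angle):
    """Concatenate normalised range readings with the relative goal (assumed inputs)."""
    scan = [min(r, MAX_RANGE) / MAX_RANGE for r in ranges]
    goal = [min(goal_distance, MAX_RANGE) / MAX_RANGE, goal_angle / math.pi]
    return scan + goal


class TinyPolicy:
    """Hypothetical stand-in for a learned policy: a one-hidden-layer MLP.

    The weights are random placeholders; a real policy would obtain them
    through DRL training.
    """

    def __init__(self, n_inputs, n_hidden=16, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
                   for _ in range(n_hidden)]
        self.w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
                   for _ in range(2)]

    def act(self, observation):
        """Map one robot's observation to a (linear, angular) velocity command."""
        hidden = [math.tanh(sum(w * x for w, x in zip(row, observation)))
                  for row in self.w1]
        raw_v, raw_w = (sum(w * h for w, h in zip(row, hidden))
                        for row in self.w2)
        linear = MAX_LINEAR * (math.tanh(raw_v) + 1.0) / 2.0   # in [0, MAX_LINEAR]
        angular = MAX_ANGULAR * math.tanh(raw_w)               # in [-MAX_ANGULAR, MAX_ANGULAR]
        return linear, angular


# Decentralized execution: every robot evaluates the policy on its own
# observation only. Sharing one set of weights is an assumption of this sketch.
policy = TinyPolicy(n_inputs=N_BEAMS + 2)
rng = random.Random(1)
observations = [
    build_observation(
        [rng.uniform(0.2, MAX_RANGE) for _ in range(N_BEAMS)],
        goal_distance=3.0,
        goal_angle=rng.uniform(-math.pi, math.pi),
    )
    for _ in range(3)
]
commands = [policy.act(obs) for obs in observations]
for i, (v, w) in enumerate(commands):
    print(f"robot {i}: v={v:.3f} m/s, w={w:.3f} rad/s")
```

Because no robot reads another robot's state or command, adding or removing robots in this sketch changes nothing about how each command is computed. Any coordination would have to emerge from what each robot perceives through its own sensors.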