In decentralized multi-robot navigation, agents lack the global world knowledge needed to reliably compute safe and (near-)optimal plans and must base their decisions on their neighbors' observable states. We present a reinforcement-learning-based multi-agent navigation algorithm that performs inter-agent communication. To handle the variable number of neighbors each agent perceives, we use a multi-head self-attention mechanism to encode neighbor information into a fixed-length observation vector. We pose communication selection as a link-prediction problem, where the network predicts whether communication is necessary given the observable information. The communicated information augments the observed neighbor information and is used to select a suitable navigation plan. We demonstrate the benefits of our approach through safe and efficient navigation among multiple robots on dense and challenging benchmarks. We also compare against other learning-based methods and show improvements in terms of fewer collisions and lower time-to-goal in dense scenarios.
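The core idea of attention-based neighbor encoding can be illustrated with a minimal sketch: the ego agent's state acts as the query and each neighbor's observable state as a key/value, so any number of neighbors is pooled into a fixed-length vector. This is an illustrative numpy implementation with hypothetical dimensions and weight matrices (`Wq`, `Wk`, `Wv`), not the paper's actual network.

```python
import numpy as np

def encode_neighbors(ego, neighbors, Wq, Wk, Wv, num_heads):
    """Attention-pool a variable number of neighbor states into a
    fixed-length vector, using the ego state as the query.

    ego:       (d,)   ego agent's state
    neighbors: (n, d) observable states of n neighbors (n may vary)
    Wq, Wk, Wv: (d, d_model) projection matrices (hypothetical)
    Returns:   (d_model,) fixed-length encoding, independent of n
    """
    d_model = Wq.shape[1]
    d_head = d_model // num_heads
    q = ego @ Wq                # (d_model,) query from ego state
    K = neighbors @ Wk          # (n, d_model) keys from neighbors
    V = neighbors @ Wv          # (n, d_model) values from neighbors
    out = []
    for h in range(num_heads):
        sl = slice(h * d_head, (h + 1) * d_head)
        # scaled dot-product attention scores for this head
        scores = K[:, sl] @ q[sl] / np.sqrt(d_head)   # (n,)
        weights = np.exp(scores - scores.max())
        weights /= weights.sum()                      # softmax over neighbors
        out.append(weights @ V[:, sl])                # (d_head,)
    return np.concatenate(out)                        # (d_model,)
```

Because the softmax pools over the neighbor axis, the output shape is the same whether the agent sees 3 neighbors or 30, which is what lets a fixed-size policy network consume it.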