AB-Mapper: Attention and BicNet Based Multi-agent Path Finding for Dynamic Crowded Environment

Multi-agent path finding in dynamic crowded environments is of great academic and practical value for multi-robot systems in the real world. To improve the effectiveness and efficiency of communication and of the learning process during path planning in dynamic crowded environments, we introduce an algorithm called Attention and BicNet based Multi-agent path planning with effective reinforcement (AB-Mapper) under the actor-critic reinforcement learning framework. In this framework, on the one hand, we use BicNet, with its communication function, in the actor network to achieve intra-team coordination. On the other hand, we propose a centralized critic network that can selectively allocate attention weights to surrounding agents. This attention mechanism allows an individual agent to automatically learn a better evaluation of actions by also considering the behaviours of its surrounding agents. Compared with the state-of-the-art method Mapper, our AB-Mapper is more effective (85.86% vs. 81.56% in terms of success rate) in solving general path finding problems with dynamic obstacles. In addition, in crowded scenarios, our method outperforms Mapper by a large margin, with a gap of more than 40% in each experiment.
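
The idea of a critic that "selectively allocates attention weights to surrounding agents" can be illustrated with a minimal sketch. The code below is not the paper's implementation: it assumes plain scaled dot-product attention with no learned projections, and the function names (`softmax`, `attend`) and the 2-D embeddings are hypothetical, chosen only for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, neighbour_keys, neighbour_values):
    """Return (weights, context) for one agent attending to its neighbours.

    Assumption: scaled dot-product attention without learned projections;
    the critic described in the abstract may differ in detail.
    """
    scale = math.sqrt(len(query))
    # One similarity score per surrounding agent.
    scores = [
        sum(q * k for q, k in zip(query, key)) / scale
        for key in neighbour_keys
    ]
    weights = softmax(scores)
    dim = len(neighbour_values[0])
    # Attention-weighted sum of the neighbours' value vectors.
    context = [
        sum(w * value[d] for w, value in zip(weights, neighbour_values))
        for d in range(dim)
    ]
    return weights, context


# Hypothetical embeddings: one agent attends to three nearby agents.
query = [1.0, 0.0]
keys = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
values = [[0.5, 0.1], [0.2, 0.9], [0.0, 0.3]]
weights, context = attend(query, keys, values)
```

In a critic of this kind, the `context` vector would typically be concatenated with the agent's own observation-action encoding before estimating a value, so that neighbours with larger weights influence the evaluation more.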


Related content

Attention mechanism

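The attention mechanism was first proposed in the field of computer vision, but it really took off with the Google DeepMind paper "Recurrent Models of Visual Attention" [14], which applied attention to an RNN model for image classification. Subsequently, Bahdanau et al., in "Neural Machine Translation by Jointly Learning to Align and Translate" [1], used an attention-like mechanism to perform translation and alignment simultaneously in machine translation; their work is regarded as the first to apply attention to NLP. Attention-based extensions of RNN models were then applied to a wide range of NLP tasks. More recently, how to use attention in CNNs has also become an active research topic. The figure below shows the rough trend of progress in attention research.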
