We consider the problem of multi-agent navigation and collision avoidance when observations are limited to the local neighborhood of each agent. We propose InforMARL, a novel architecture for multi-agent reinforcement learning (MARL) which uses local information intelligently to compute paths for all the agents in a decentralized manner. Specifically, InforMARL aggregates information about the local neighborhood of agents for both the actor and the critic using a graph neural network and can be used in conjunction with any standard MARL algorithm. We show that (1) in training, InforMARL has better sample efficiency and performance than baseline approaches, despite using less information, and (2) in testing, it scales well to environments with arbitrary numbers of agents and obstacles. We illustrate these results using four task environments, including one with predetermined goals for each agent, and one in which the agents collectively try to cover all goals.
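The neighborhood aggregation described above can be illustrated with a minimal sketch. This is not the InforMARL architecture: it replaces the learned graph neural network with one round of plain mean aggregation, and the names `local_neighbors`, `aggregate`, and `radius`, along with the toy positions and features, are assumptions made for illustration only. It shows just the general idea that each agent builds its embedding from the entities inside its own sensing range, without access to global state.

```python
import math


def local_neighbors(positions, radius):
    """For each entity i, list the indices of other entities within `radius`."""
    neighbors = []
    for i, (xi, yi) in enumerate(positions):
        near = [
            j
            for j, (xj, yj) in enumerate(positions)
            if j != i and math.hypot(xi - xj, yi - yj) <= radius
        ]
        neighbors.append(near)
    return neighbors


def aggregate(features, neighbors):
    """One round of mean aggregation over each entity and its local neighbors.

    A learned GNN layer would apply trainable transformations here; the plain
    mean is a stand-in (assumption) to keep the sketch dependency-free.
    """
    out = []
    for i, near in enumerate(neighbors):
        group = [features[i]] + [features[j] for j in near]
        dim = len(features[i])
        out.append([sum(v[k] for v in group) / len(group) for k in range(dim)])
    return out


# Hypothetical toy scene: two nearby agents and one distant agent.
positions = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0)]
features = [[1.0, 0.0], [3.0, 2.0], [10.0, 10.0]]

neighbors = local_neighbors(positions, radius=2.0)
embeddings = aggregate(features, neighbors)
```

In a setup like this, the per-agent embedding would be passed to that agent's actor, and some pooled form of the embeddings to the critic. Because the aggregation is defined per neighborhood, the same function applies unchanged when the number of agents or obstacles differs between training and testing.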