Flocking is a challenging problem in multi-agent systems, and traditional flocking methods require complete knowledge of the environment and a precise model for control. In this paper, we propose Evolutionary Multi-Agent Reinforcement Learning (EMARL) for flocking tasks, a hybrid algorithm that combines cooperation and competition while requiring little prior knowledge. For cooperation, we design the agents' rewards for flocking tasks according to the boids model. For competition, agents with high fitness are designated as senior agents and those with low fitness as junior agents, with junior agents stochastically inheriting the parameters of senior agents. To intensify competition, we further design an evolutionary selection mechanism that proves effective for credit assignment in flocking tasks. Experimental results on a range of challenging and self-contrast benchmarks demonstrate that EMARL significantly outperforms fully competitive and fully cooperative methods.
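The boids-based reward mentioned above could be sketched as follows. This is a minimal illustration, not the paper's exact formulation: the function name `boids_reward`, the weights, and the separation radius are all assumptions, combining the three classic boids terms (cohesion, separation, alignment) into a scalar reward.

```python
import numpy as np

def boids_reward(pos, vel, neighbor_pos, neighbor_vel,
                 sep_radius=1.0, w_coh=1.0, w_sep=1.0, w_ali=1.0):
    """Hypothetical boids-style reward for one agent.

    Rewards staying near the neighbors' center of mass (cohesion)
    and matching their average velocity (alignment), and penalizes
    neighbors closer than sep_radius (separation). Weights and radius
    are illustrative assumptions, not values from the paper.
    """
    # Cohesion: negative distance to the neighbors' center of mass.
    center = neighbor_pos.mean(axis=0)
    coh = -np.linalg.norm(pos - center)
    # Separation: penalize each neighbor inside the separation radius.
    dists = np.linalg.norm(neighbor_pos - pos, axis=1)
    sep = -np.sum(np.maximum(0.0, sep_radius - dists))
    # Alignment: negative deviation from the neighbors' mean velocity.
    ali = -np.linalg.norm(vel - neighbor_vel.mean(axis=0))
    return w_coh * coh + w_sep * sep + w_ali * ali
```

Under this sketch, an agent centered in the flock and moving with its neighbors receives a higher reward than one moving against them, which is the cooperative signal the abstract attributes to the boids model.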