Cooperative multi-agent systems are widely used across many domains. Interaction among agents brings benefits such as lower operating costs, high scalability, and easier parallel processing. These systems are also well suited to large-scale, unknown, and dynamic environments. However, learning in such environments remains a major challenge in many applications: the size of the search space inflates learning time, cooperation among agents is inefficient, and the agents' decisions lack proper coordination. Moreover, reinforcement learning algorithms may suffer from long convergence times on these problems. In this paper, a communication framework based on knowledge transfer is introduced to address these challenges in the herding problem with a large state space. To address slow convergence, knowledge transfer is used, which can significantly increase the efficiency of reinforcement learning algorithms. Coordination among the agents is carried out through a head agent in each group and through a coordinator agent. The results demonstrate that this framework can increase learning speed and reduce convergence time.