Improved Reinforcement Learning in Cooperative Multi-agent Environments Using Knowledge Transfer

Cooperative multi-agent systems are now widely used to learn how to achieve goals in large-scale dynamic environments. However, learning in these environments is challenging: a large search space lengthens learning time, and agents often cooperate inefficiently. Moreover, reinforcement learning algorithms may converge slowly in such environments. In this paper, we introduce a communication framework in which agents learn to cooperate effectively, together with a new state calculation method that considerably reduces the size of the state space. Furthermore, we present a knowledge-transfer algorithm that shares gained experience among agents, and we develop a knowledge-fusing mechanism that combines the knowledge an agent learns from its own experience with the knowledge it receives from other team members. We evaluate our approach on the shepherding problem, a complex learning task. The simulation results show that the knowledge-transfer mechanism accelerates learning and that generating similar states through state abstraction reduces the size of the state space.
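
To make the two ideas in the abstract concrete, the following is a minimal sketch, not the paper's algorithm. It assumes tabular Q-values stored in dictionaries, a visit-count-weighted average as the fusion rule, and a grid-bucketed relative offset as the state abstraction. The names `abstract_state`, `fuse_q_tables`, and the `cell` parameter are hypothetical and chosen only for illustration.

```python
def abstract_state(agent_pos, target_pos, cell=5):
    """Map absolute positions to a coarse relative offset (assumed abstraction).

    Positions that differ only by a small shift fall into the same bucket,
    so many concrete states share one abstract state.
    """
    dx = target_pos[0] - agent_pos[0]
    dy = target_pos[1] - agent_pos[1]
    return (dx // cell, dy // cell)


def fuse_q_tables(own_q, own_visits, peer_q, peer_visits):
    """Fuse two tabular Q-functions by a visit-count-weighted average.

    This weighting is an assumption for illustration: the side with more
    experience on a (state, action) pair contributes more to the result.
    """
    fused = {}
    for key in set(own_q) | set(peer_q):
        n_own = own_visits.get(key, 0)
        n_peer = peer_visits.get(key, 0)
        total = n_own + n_peer
        if total == 0:
            # No recorded experience on either side: prefer the agent's own value.
            fused[key] = own_q.get(key, peer_q.get(key, 0.0))
            continue
        weighted = n_own * own_q.get(key, 0.0) + n_peer * peer_q.get(key, 0.0)
        fused[key] = weighted / total
    return fused


if __name__ == "__main__":
    s = abstract_state((12, 7), (3, 9))
    own_q = {(s, "left"): 1.0}
    own_visits = {(s, "left"): 3}
    peer_q = {(s, "left"): 5.0, (s, "up"): 2.0}
    peer_visits = {(s, "left"): 1, (s, "up"): 4}
    print(fuse_q_tables(own_q, own_visits, peer_q, peer_visits))
```

Under these assumptions, an agent that has never tried an action in some abstract state simply inherits its teammate's estimate, which is one plausible way shared knowledge could shorten learning time.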
