Recently, deep Reinforcement Learning (RL) algorithms have achieved dramatic progress in the multi-agent domain. However, training on increasingly complex tasks is time-consuming and resource-intensive. To alleviate this problem, efficiently leveraging historical experience is essential, yet this remains under-explored in previous studies: most existing methods may fail to achieve this goal in a continuously changing system due to their complicated designs and the environmental dynamics. In this paper, we propose a method named "KnowRU" for knowledge reuse, which can be easily deployed in the majority of multi-agent reinforcement learning (MARL) algorithms without complicated hand-coded design. We employ the knowledge distillation paradigm to transfer knowledge among agents, with the goal of accelerating the training phase for new tasks while improving the asymptotic performance of agents. To empirically demonstrate the robustness and effectiveness of KnowRU, we perform extensive experiments with state-of-the-art MARL algorithms on collaborative and competitive scenarios. The results show that KnowRU outperforms recently reported methods, which underscores the importance of the proposed knowledge reuse for MARL.