Deep reinforcement learning (DRL) has been widely applied to autonomous exploration and mapping tasks, but it often struggles with low sample efficiency, poor adaptability to unknown map sizes, and slow simulation speed. To speed up convergence, we combine curriculum learning (CL) with DRL and propose a novel Cumulative Curriculum Reinforcement Learning (CCRL) training framework to alleviate the catastrophic forgetting that general CL suffers from. In addition, we present a novel state representation that combines a local egocentric map with a global exploration map resized to a fixed dimension, allowing the policy to flexibly adapt to environments of various sizes and shapes. Furthermore, to facilitate fast training of DRL models, we develop a lightweight grid-based simulator that runs substantially faster than popular robot simulation platforms such as Gazebo. Comprehensive experiments conducted on this simulator show that the CCRL framework not only mitigates the catastrophic forgetting problem but also improves the sample efficiency and generalization of DRL models, compared with both general CL and training without a curriculum. Our code is available at https://github.com/BeamanLi/CCRL_Exploration.
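To make the fixed-dimension state representation concrete, the sketch below shows one plausible way to assemble it with NumPy: an egocentric crop centered on the robot plus a nearest-neighbor resize of the global exploration map, so grids of any size map to the same network input shape. The function name, channel layout, and sizes are illustrative assumptions, not taken from the paper or the linked repository.

```python
import numpy as np

def build_state(global_map: np.ndarray, robot_rc: tuple,
                local_size: int = 64, out_size: int = 64) -> np.ndarray:
    """Assemble a two-channel state from an exploration grid.

    Hypothetical sketch: channel 0 is a local egocentric crop,
    channel 1 is the whole map resized to a fixed dimension.
    """
    h, w = global_map.shape
    r, c = robot_rc
    half = local_size // 2

    # Local egocentric map: pad the grid so the crop never leaves it,
    # then take a local_size x local_size window centered on the robot.
    padded = np.pad(global_map, half, mode="constant", constant_values=0)
    local = padded[r:r + local_size, c:c + local_size]

    # Global exploration map resized to a fixed dimension via
    # nearest-neighbor sampling, independent of the original map size.
    rows = np.arange(out_size) * h // out_size
    cols = np.arange(out_size) * w // out_size
    global_resized = global_map[np.ix_(rows, cols)]

    # Stack as channels; if local_size != out_size, the local crop
    # would also need resizing (omitted for brevity).
    return np.stack([local, global_resized], axis=0)

# Example: a 200 x 320 occupancy grid, robot at row 50, column 120.
state = build_state(np.zeros((200, 320), dtype=np.float32), (50, 120))
assert state.shape == (2, 64, 64)
```

Because both channels always have the same fixed shape regardless of the underlying map, the same policy network can be trained and deployed across environments of different sizes, which is the adaptability the abstract refers to.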