Reinforcement learning has shown great promise for training robot behavior because of its sequential decision-making nature. However, the enormous amount of interactive and informative training data it requires remains the major stumbling block to progress. In this study, we focus on accelerating reinforcement learning (RL) training and improving performance on multi-goal reaching tasks. Specifically, we propose a precision-based continuous curriculum learning (PCCL) method in which the precision requirements are gradually adjusted during training, instead of being fixed in a static schedule. To this end, we explore various continuous curriculum strategies for controlling the training process. The approach is tested with a Universal Robot 5e in both simulated and real-world multi-goal reaching experiments. The experimental results support the hypothesis that a static training schedule is suboptimal, and that an appropriate decay function for curriculum learning yields superior results in less training time.
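One way to read the PCCL idea is as a decay function that maps training progress to the success tolerance of the reaching task: a loose tolerance early on makes rewards denser, and gradually tightening it raises the precision requirement. The sketch below illustrates this under assumed names and constants (the function name, parameter values, and the specific exponential form are illustrative, not taken from the paper):

```python
import math

def precision_schedule(step, total_steps, eps_start=0.2, eps_end=0.01,
                       kind="exponential"):
    """Return the goal-reaching tolerance for the current training step.

    All parameter names and default values here are illustrative
    assumptions; the paper's actual schedules may differ.
    """
    # Normalised training progress in [0, 1].
    t = min(max(step / total_steps, 0.0), 1.0)
    if kind == "static":
        # Baseline: a fixed requirement throughout training.
        return eps_end
    if kind == "linear":
        # Tolerance shrinks at a constant rate.
        return eps_start + (eps_end - eps_start) * t
    if kind == "exponential":
        # Tolerance decays smoothly from eps_start towards eps_end.
        return eps_end + (eps_start - eps_end) * math.exp(-5.0 * t)
    raise ValueError(f"unknown schedule kind: {kind}")
```

A goal-conditioned reacher would then count an episode as successful when the end-effector lands within `precision_schedule(step, total_steps)` of the target, so early episodes succeed (and learn) often while later episodes are held to the final precision.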