The Reinforcement Learning (RL) paradigm has been an essential tool for automating robotic tasks. Despite advances in RL, it is still not widely adopted in industry due to the large amount of costly robot interaction with the environment that it requires. Curriculum Learning (CL) has been proposed to expedite learning. However, most research has been evaluated only in simulated environments, from video games to robotic toy tasks. This paper presents a study on accelerating robot learning of contact-rich manipulation tasks based on Curriculum Learning combined with Domain Randomization (DR). We tackle complex industrial assembly tasks with position-controlled robots, such as insertion tasks. We compare different curriculum designs and sampling approaches for DR. Based on this study, we propose a method that significantly outperforms previous work, which uses DR only (no CL), with less than a fifth of the training time (samples). Results also show that even when training only in simulation with toy tasks, our method can learn policies that transfer to a real-world robot. The learned policies achieved success rates of up to 86\% on real-world complex industrial insertion tasks (with tolerances of $\pm 0.01~mm$) not seen during training.