Despite the potential of reinforcement learning (RL) for building general-purpose robotic systems, training RL agents to solve robotics tasks remains challenging due to the difficulty of exploration in purely continuous action spaces. Addressing this problem is an active area of research, with the majority of focus on improving RL methods via better optimization or more efficient exploration. An alternative but important component to consider improving is the interface of the RL algorithm with the robot. In this work, we manually specify a library of robot action primitives (RAPS), parameterized with arguments that are learned by an RL policy. These parameterized primitives are expressive, simple to implement, enable efficient exploration, and can be transferred across robots, tasks, and environments. We perform a thorough empirical study across challenging tasks in three distinct domains with image input and a sparse terminal reward. We find that our simple change to the action interface substantially improves both the learning efficiency and task performance irrespective of the underlying RL algorithm, significantly outperforming prior methods which learn skills from offline expert data. Code and videos are available at https://mihdalal.github.io/raps/
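To make the action-interface idea concrete, the sketch below illustrates one way a parameterized primitive library could be exposed to an RL policy: the policy emits a single flat vector containing scores over the primitives plus an argument slot for each primitive, and only the arguments of the selected primitive are decoded and executed to completion as low-level actions. This is a minimal illustrative sketch, not the authors' implementation; the primitive set, argument layout, and function names (`reach`, `grasp`, `decode_action`) are assumptions made for exposition.

```python
import numpy as np

# Hypothetical, hand-specified primitive library (illustrative only).
def reach(args):
    """Move the end-effector by a delta (dx, dy, dz), executed as k low-level steps."""
    dx, dy, dz = args[:3]
    k = 5
    return [np.array([dx / k, dy / k, dz / k, 0.0]) for _ in range(k)]

def grasp(args):
    """Close the gripper toward a target width (single low-level step)."""
    width = args[0]
    return [np.array([0.0, 0.0, 0.0, width])]

PRIMITIVES = [reach, grasp]   # manually specified library of primitives
MAX_ARGS = 3                  # argument slots reserved per primitive

def decode_action(policy_output):
    """Map one flat policy output to (primitive, arguments).

    The first len(PRIMITIVES) entries score the primitives; the remaining
    entries hold one argument slot per primitive, and only the slot of the
    selected primitive is used.
    """
    scores = policy_output[: len(PRIMITIVES)]
    idx = int(np.argmax(scores))
    start = len(PRIMITIVES) + idx * MAX_ARGS
    args = policy_output[start : start + MAX_ARGS]
    return PRIMITIVES[idx], args

# Example: decode a random "policy output" and roll the primitive out open-loop.
flat = np.random.uniform(-1, 1, size=len(PRIMITIVES) + MAX_ARGS * len(PRIMITIVES))
primitive, args = decode_action(flat)
low_level_actions = primitive(args)   # executed to completion before the next decision
```

Under this kind of interface, each policy decision triggers a temporally extended, semantically meaningful behavior rather than a single torque or pose increment, which is one intuition for why exploration with sparse terminal rewards becomes easier.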