Skill-based reinforcement learning (RL) has emerged as a promising strategy to leverage prior knowledge for accelerated robot learning. Skills are typically extracted from expert demonstrations and embedded into a latent space from which they can be sampled as actions by a high-level RL agent. However, this skill space is expansive, and not all skills are relevant for a given robot state, making exploration difficult. Furthermore, the downstream RL agent is limited to learning tasks structurally similar to those used to construct the skill space. We first propose accelerating exploration in the skill space using state-conditioned generative models that directly bias the high-level agent towards sampling only the skills relevant to a given state, based on prior experience. Next, we propose a low-level residual policy for fine-grained skill adaptation, enabling downstream RL agents to adapt to unseen task variations. Finally, we validate our approach across four challenging manipulation tasks that differ from those used to build the skill space, demonstrating the ability to learn across task variations while significantly accelerating exploration, outperforming prior work. Code and videos are available on our project website: https://krishanrana.github.io/reskill.
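The pipeline the abstract describes can be summarized in a minimal sketch: a state-conditioned prior proposes latent skills, a skill decoder maps a sampled latent to a base action, and a residual policy adds a small correction for fine-grained adaptation. All function names and mappings below are hypothetical toy stand-ins for the learned models in the paper, shown only to illustrate the control flow.

```python
import numpy as np

rng = np.random.default_rng(0)

def skill_prior(state):
    # Hypothetical state-conditioned prior: returns the mean/std of a
    # Gaussian over a 4-D latent skill space. In the paper this would be a
    # learned generative model trained on prior experience, biasing the
    # high-level agent towards state-relevant skills.
    mean = np.tanh(state[:4])           # toy mapping: state -> skill mean
    std = np.full(4, 0.1)
    return mean, std

def decode_skill(state, z):
    # Hypothetical low-level skill decoder: maps (state, latent skill)
    # to a base action.
    return np.tanh(z + 0.1 * state[:4])

def residual_policy(state, base_action):
    # Hypothetical residual policy: a small state-dependent correction
    # added to the decoded skill action for unseen task variations.
    return 0.05 * np.tanh(state[:4])

state = rng.normal(size=8)
mean, std = skill_prior(state)
z = rng.normal(mean, std)               # high-level agent samples a skill
base = decode_skill(state, z)           # decode latent skill to an action
action = base + residual_policy(state, base)
```

In the actual method the prior, decoder, and residual policy are learned networks; this sketch only fixes the order of operations: sample from the state-conditioned prior, decode, then apply the residual.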