Real-world autonomous missions often require rich interaction with nearby objects, such as doors or switches, along with effective navigation. However, such complex behaviors are difficult to learn because they involve both high-level planning and low-level motor control. We present a novel framework, Cascaded Compositional Residual Learning (CCRL), which learns composite skills by recursively leveraging a library of previously learned control policies. Our framework learns multiplicative policy composition, task-specific residual actions, and synthetic goal information simultaneously while freezing the prerequisite policies. We further explicitly control the style of the motion by regularizing residual actions. We show that our framework learns joint-level control policies for a diverse set of motor skills ranging from basic locomotion to complex interactive navigation, including navigating around obstacles, pushing objects, crawling under a table, pushing a door open with its leg, and holding it open while walking through it. The proposed CCRL framework leads to policies with consistent styles and lower joint torques, which we successfully transfer to a real Unitree A1 robot without any additional fine-tuning.
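To make the combination of multiplicative policy composition and regularized residual actions more concrete, the following is a minimal sketch in plain Python. It is not the authors' implementation: the abstract does not give the composition rule, so the sketch assumes that each frozen prerequisite policy outputs a diagonal Gaussian over joint actions and that the policies are combined as a weighted product of Gaussians (a precision-weighted mean). The function names `compose_gaussians`, `apply_residual`, and `residual_penalty`, the squared-norm form of the regularizer, and the coefficient value are all hypothetical.

```python
import math


def compose_gaussians(means, stds, weights):
    """Weighted product of diagonal Gaussian policies (assumed composition rule).

    means, stds: one list per prerequisite policy, one entry per action dimension.
    weights: one non-negative weight per policy (at least one must be positive).
    Returns the mean and standard deviation of the composite Gaussian.
    """
    dims = len(means[0])
    comp_mean, comp_std = [], []
    for j in range(dims):
        # Each policy contributes its precision (1 / variance), scaled by its weight.
        precisions = [w / (s[j] ** 2) for w, s in zip(weights, stds)]
        total = sum(precisions)
        # Composite mean is the precision-weighted average of the policy means.
        comp_mean.append(sum(p * m[j] for p, m in zip(precisions, means)) / total)
        comp_std.append(math.sqrt(1.0 / total))
    return comp_mean, comp_std


def apply_residual(composite_mean, residual):
    """Add a task-specific residual action to the composite action."""
    return [a + r for a, r in zip(composite_mean, residual)]


def residual_penalty(residual, coeff):
    """Squared-norm penalty on the residual (hypothetical regularizer form)."""
    return coeff * sum(r * r for r in residual)


# Two frozen one-dimensional policies with equal weights and unit variance.
mean, std = compose_gaussians(
    means=[[0.0], [1.0]], stds=[[1.0], [1.0]], weights=[1.0, 1.0]
)
action = apply_residual(mean, residual=[0.1])
penalty = residual_penalty([0.1], coeff=0.5)
```

In this sketch, a larger weight or a smaller standard deviation pulls the composite action toward that policy, while the penalty keeps the residual small so that the resulting motion stays close to the style of the prerequisite skills. In a learned system the weights and the residual would presumably be outputs of a trainable network while the prerequisite policies stay frozen, as the abstract describes.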