In this paper, we propose a data-driven skill learning approach that solves highly dynamic manipulation tasks entirely from offline teleoperated play data. We use a bilateral teleoperation system to continuously collect a large set of dexterous and agile manipulation behaviors; direct force feedback to the operator is what makes collecting such behaviors possible. Using a two-stage generative modeling framework, we jointly learn the state-conditional latent skill distribution and the skill decoder network, in the form of a goal-conditioned policy and skill-conditional state transition dynamics. This enables robust model-based planning in the learned skill space, with both online and offline planning methods, to accomplish any given downstream task at test time. We provide both simulated and real-world dual-arm box manipulation experiments showing that a sequence of force-controlled dynamic manipulation skills can be composed in real time to bring the box to a randomly selected target position and orientation; please refer to the supplementary video, https://youtu.be/LA5B236ILzM.
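To make the idea of online planning in a learned skill space concrete, the following is a minimal sketch, not the authors' implementation. It assumes a cross-entropy-method (CEM) planner, which the abstract does not specify, and replaces the learned networks with toy stand-ins: `skill_dynamics` is a hypothetical placeholder for the learned skill-conditional transition model, the 2-D state stands in for a box pose, and all names, dimensions, and hyperparameters are illustrative assumptions.

```python
import numpy as np

SKILL_DIM = 2  # assumed latent skill dimension (illustrative only)
STATE_DIM = 2  # assumed state dimension, e.g. a planar box position


def skill_dynamics(state, z):
    """Toy stand-in for a learned skill-conditional transition model s' = f(s, z)."""
    return state + 0.1 * np.tanh(z)


def rollout_cost(state, skills, goal):
    """Roll a batch of skill sequences through the model and sum the goal distances."""
    s = np.broadcast_to(state, (skills.shape[0], STATE_DIM)).copy()
    cost = np.zeros(skills.shape[0])
    for t in range(skills.shape[1]):
        s = skill_dynamics(s, skills[:, t])
        cost += np.linalg.norm(s - goal, axis=1)
    return cost


def plan_skills(state, goal, rng, horizon=5, n_samples=256, n_elite=32, n_iters=4):
    """CEM over sequences of latent skills; returns the refined mean sequence."""
    # Start from a unit Gaussian, standing in for a learned skill prior.
    mean = np.zeros((horizon, SKILL_DIM))
    std = np.ones((horizon, SKILL_DIM))
    for _ in range(n_iters):
        skills = mean + std * rng.standard_normal((n_samples, horizon, SKILL_DIM))
        cost = rollout_cost(state, skills, goal)
        elite = skills[np.argsort(cost)[:n_elite]]  # keep the lowest-cost sequences
        mean, std = elite.mean(axis=0), elite.std(axis=0) + 1e-3
    return mean


def run_episode(start, goal, n_steps=20, seed=0):
    """Receding-horizon loop: replan at every step, apply only the first skill."""
    rng = np.random.default_rng(seed)
    state = np.asarray(start, dtype=float)
    goal = np.asarray(goal, dtype=float)
    initial = np.linalg.norm(state - goal)
    for _ in range(n_steps):
        plan = plan_skills(state, goal, rng)
        # On a real system a skill decoder policy would execute the skill;
        # here the toy model itself plays the role of the environment.
        state = skill_dynamics(state, plan[0])
    return initial, np.linalg.norm(state - goal)


d0, dT = run_episode([0.0, 0.0], [0.5, -0.3])
print(f"distance to goal: {d0:.3f} -> {dT:.3f}")
```

Replanning at every step and executing only the first skill is one common way to realize online planning; an offline variant would instead optimize a full skill sequence once and execute it open-loop.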