We develop a method for learning periodic tasks from visual demonstrations. The core idea is to leverage periodicity in the policy structure to model the periodic aspects of the tasks. We use active learning to optimize the parameters of rhythmic dynamic movement primitives (rDMPs) and propose an objective that maximizes the similarity between the motion of objects manipulated by the robot and the desired motion in human video demonstrations. We consider tasks with deformable objects and granular matter whose states are challenging to represent and track: wiping surfaces with a cloth, winding cables/wires, and stirring granular matter with a spoon. Our method does not require tracking markers or manual annotations. The initial training data consists of 10-minute videos of random, unpaired interactions with objects by the robot and by a human. We use these videos for unsupervised learning of a keypoint model that provides task-agnostic visual correspondences. Then, we use Bayesian optimization to optimize rDMPs from a single human video demonstration within a few robot trials. We present simulation and hardware experiments to validate our approach.
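To make the policy structure concrete, the following is a minimal sketch of a one-dimensional rhythmic DMP rollout in the standard textbook form (a critically damped spring pulled toward a center, driven by a phase-periodic forcing term). It is not the implementation used in this work: the function name `rollout_rdmp`, the gain values, the basis width, and the example weights are all assumptions chosen for illustration. In the setting described above, parameters such as the weights, amplitude, and period would be the quantities proposed by Bayesian optimization; that outer loop and the video-based similarity objective are not shown.

```python
import math


def rollout_rdmp(weights, amplitude=1.0, period=2.0, center=0.0,
                 duration=6.0, dt=0.002, alpha=25.0, beta=6.25, width=2.5):
    """Integrate a 1-D rhythmic DMP and return the position trajectory.

    All default values are illustrative assumptions, not values from the paper.
    """
    n = len(weights)
    # Basis centers spread evenly over one cycle of the phase variable.
    centers = [2.0 * math.pi * i / n for i in range(n)]
    omega = 2.0 * math.pi / period          # phase velocity (rad/s)
    steps = int(round(duration / dt))

    y, z, phase = center, 0.0, 0.0          # position, scaled velocity, phase
    trajectory = []
    for _ in range(steps):
        # Periodic (von Mises shaped) basis functions of the phase.
        psi = [math.exp(width * (math.cos(phase - c) - 1.0)) for c in centers]
        # Normalized weighted sum, scaled by the amplitude parameter.
        forcing = amplitude * sum(p * w for p, w in zip(psi, weights)) / sum(psi)
        # Spring-damper toward `center` plus forcing; beta = alpha / 4
        # gives critical damping. Velocity is updated first, then position.
        z += omega * (alpha * (beta * (center - y) - z) + forcing) * dt
        y += omega * z * dt
        phase = (phase + omega * dt) % (2.0 * math.pi)
        trajectory.append(y)
    return trajectory


# Hypothetical weights; in practice an optimizer would propose these.
trajectory = rollout_rdmp([50.0, -50.0, 20.0, 0.0, -20.0])
```

Because the forcing term depends only on the phase, the rollout settles into a motion that repeats every `period` seconds regardless of the weights, which is why a small parameter vector is enough to describe a wiping, winding, or stirring cycle.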