We introduce a simple new method for visual imitation learning, which allows a novel robot manipulation task to be learned from a single human demonstration, without requiring any prior knowledge of the object being interacted with. Our method models imitation learning as a state estimation problem, with the state defined as the end-effector's pose at the point where object interaction begins, as observed from the demonstration. Then, by modelling a manipulation task as a coarse approach trajectory followed by a fine interaction trajectory, this state estimator can be trained in a self-supervised manner, by automatically moving the end-effector's camera around the object. At test time, the end-effector moves to the estimated state through a linear path, at which point the original demonstration's end-effector velocities are simply replayed. This enables convenient acquisition of a complex interaction trajectory, without needing to explicitly learn a policy. Real-world experiments on 8 everyday tasks show that our method can learn a diverse range of skills from a single human demonstration, whilst also yielding a stable and interpretable controller.
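To make the coarse-to-fine test-time procedure concrete, the sketch below shows the two phases the abstract describes: a linear approach to the estimated bottleneck pose, followed by open-loop replay of the demonstration's end-effector velocities. This is an illustrative sketch, not the authors' implementation; the `robot` and `estimator` interfaces, the control rate, and the approach speed are all assumed placeholders.

```python
import numpy as np

# Minimal sketch of the test-time controller (hypothetical interfaces, not the
# paper's API):
#   robot.get_ee_pose()      -> 6-DoF end-effector pose [x, y, z, rx, ry, rz]
#   robot.get_wrist_image()  -> current wrist-camera image
#   robot.set_ee_velocity(v) -> command a 6-DoF end-effector velocity
#   estimator.predict(img)   -> self-supervised state estimator, returning the
#                               bottleneck pose at which interaction began in
#                               the demonstration.

SPEED = 0.05  # assumed approach speed along the linear path (m/s)

def run_task(robot, estimator, demo_velocities):
    """Coarse linear approach to the estimated state, then fine replay."""
    # Coarse phase: estimate the state (bottleneck pose) from the wrist camera,
    # then servo towards it along a straight line in position space.
    target = estimator.predict(robot.get_wrist_image())
    while True:
        error = target[:3] - robot.get_ee_pose()[:3]
        if np.linalg.norm(error) < 1e-3:  # within 1 mm of the bottleneck
            break
        linear = SPEED * error / np.linalg.norm(error)
        # Orientation is held fixed here purely to keep the sketch simple.
        robot.set_ee_velocity(np.concatenate([linear, np.zeros(3)]))

    # Fine phase: replay the demonstration's end-effector velocities open-loop,
    # acquiring the complex interaction trajectory without learning a policy.
    for v in demo_velocities:  # one 6-DoF velocity per control step
        robot.set_ee_velocity(v)
```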