Learning from Demonstration (LfD) is a popular approach that allows humans to teach robots new skills by demonstrating the correct way(s) of performing them. Human-provided demonstrations, however, are not always optimal, and the teacher usually addresses this issue by discarding or replacing sub-optimal (noisy or faulty) demonstrations. We propose a novel LfD representation that learns from both successful and failed demonstrations of a skill. Our approach encodes the two subsets of captured demonstrations (labeled by the teacher) into a statistical skill model, constructs a set of quadratic costs, and finds an optimal reproduction of the skill under novel problem conditions (i.e., constraints). The optimal reproduction balances convergence towards successful examples and divergence from failed examples. We evaluate our approach through several 2D and 3D real-world experiments using a UR5e manipulator arm, and also show that it can reproduce a skill from only failed demonstrations. The benefits of exploiting both failed and successful demonstrations are shown through comparison with two existing LfD approaches. We also compare our approach against an existing skill refinement method and demonstrate its capabilities in a multi-coordinate setting.
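To make the idea of balancing convergence and divergence under quadratic costs concrete, the following is a minimal sketch and not the formulation used in this work. It assumes that each subset of demonstrations is summarized only by its time-aligned mean trajectory (a stand-in for the statistical skill model), that the novel problem conditions are fixed start and goal points, and that an acceleration penalty keeps the result smooth. The function name `reproduce` and the weights `w_s`, `w_f`, and `smooth` are hypothetical.

```python
import numpy as np

def reproduce(successes, failures, start, goal, w_s=1.0, w_f=0.3, smooth=5.0):
    """Illustrative sketch only (not the paper's method).

    successes, failures: arrays of shape (N, T, D), time-aligned demonstrations.
    start, goal: arrays of shape (D,), hard endpoint constraints.
    Minimizes  w_s*|x - mu_s|^2 - w_f*|x - mu_f|^2 + smooth*|D2 x|^2.
    """
    mu_s = np.mean(successes, axis=0)  # (T, D) mean of successful demos
    mu_f = np.mean(failures, axis=0)   # (T, D) mean of failed demos
    T = mu_s.shape[0]
    if w_s <= w_f:
        # The repulsive term is concave; w_s > w_f keeps the total cost convex.
        raise ValueError("w_s must exceed w_f")

    # Second-difference operator, shape (T-2, T), penalizing acceleration.
    D2 = np.zeros((T - 2, T))
    for i in range(T - 2):
        D2[i, i:i + 3] = [1.0, -2.0, 1.0]

    # Setting the gradient of the quadratic cost to zero gives A x = b.
    A = (w_s - w_f) * np.eye(T) + smooth * D2.T @ D2
    b = w_s * mu_s - w_f * mu_f

    # Fix the endpoints and solve the reduced system for the interior points.
    x = np.zeros_like(mu_s)
    x[0], x[-1] = start, goal
    inner = slice(1, T - 1)
    fixed = [0, T - 1]
    rhs = b[inner] - A[inner][:, fixed] @ x[fixed]
    x[inner] = np.linalg.solve(A[inner, inner], rhs)
    return x
```

In this sketch the attraction and repulsion terms share one linear system, so the reproduction is obtained in closed form. Raising `w_f` towards `w_s` pushes the result further from the failed examples, at the price of a less well-conditioned problem.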