Learning by demonstration is a versatile and rapid mechanism for transferring motor skills from a teacher to a learner. A particular challenge in imitation learning is the so-called correspondence problem, which involves mapping actions between a teacher and a learner with substantially different embodiments (for example, human to robot). We present a general, model-free, non-parametric imitation learning algorithm based on regression between two Hilbert spaces. We accomplish this via Kirszbraun's extension theorem (apparently the first application of this technique to supervised learning) and analyze its statistical and computational aspects. We begin by formulating the correspondence problem as quadratically constrained quadratic program (QCQP) regression. We then describe a procedure for smoothing the training data, which amounts to regularizing hypothesis complexity via its Lipschitz constant. The Lipschitz constant is tuned via a Structural Risk Minimization (SRM) procedure, based on covering-number risk bounds that we derive. We apply our technique to a static posture imitation task between two robotic manipulators with different embodiments, and report promising results.
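To make the role of the Lipschitz constant concrete, here is a minimal Python sketch. It is not the authors' implementation and it does not solve the QCQP. It assumes finite-dimensional Euclidean vectors as stand-ins for the two Hilbert spaces, and the function names (`empirical_lipschitz`, `satisfies_extension_constraints`) are hypothetical, chosen only for illustration. The first function computes the smallest constant L for which the training pairs are L-Lipschitz. The second checks the pairwise distance constraints that a Kirszbraun-style extension must satisfy when a prediction y is proposed at a new input x.

```python
import math


def dist(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def empirical_lipschitz(xs, ys):
    """Smallest L with ||y_i - y_j|| <= L * ||x_i - x_j|| for all pairs.

    Returns math.inf if two identical inputs map to different outputs,
    since no finite L can satisfy the constraint in that case.
    """
    best = 0.0
    for i in range(len(xs)):
        for j in range(i + 1, len(xs)):
            dx = dist(xs[i], xs[j])
            dy = dist(ys[i], ys[j])
            if dx == 0.0:
                if dy > 0.0:
                    return math.inf
                continue
            best = max(best, dy / dx)
    return best


def satisfies_extension_constraints(x, y, xs, ys, L, tol=1e-9):
    """Check ||y - y_i|| <= L * ||x - x_i|| for every training pair."""
    return all(dist(y, yi) <= L * dist(x, xi) + tol
               for xi, yi in zip(xs, ys))


# Toy data (illustrative only): the outputs are the inputs scaled by 2.
xs = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
ys = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0)]
L = empirical_lipschitz(xs, ys)
feasible = satisfies_extension_constraints((0.5, 0.5), (1.0, 1.0), xs, ys, L)
infeasible = satisfies_extension_constraints((0.5, 0.5), (3.0, 3.0), xs, ys, L)
```

Kirszbraun's theorem guarantees that, for an L-Lipschitz map between Hilbert spaces defined on a subset, a prediction satisfying all of these constraints exists at every new input. Finding one is the constrained optimization step that this sketch leaves out.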