Learning from demonstrations in the wild (e.g. YouTube videos) is a tantalizing goal in imitation learning. However, for this goal to be achieved, imitation learning algorithms must deal with the fact that the demonstrators and learners may have bodies that differ from one another. This condition -- "embodiment mismatch" -- is ignored by many recent imitation learning algorithms. Our proposed imitation learning technique, SILEM (\textbf{S}keletal feature compensation for \textbf{I}mitation \textbf{L}earning with \textbf{E}mbodiment \textbf{M}ismatch), addresses a particular type of embodiment mismatch by introducing a learned affine transform to compensate for differences in the skeletal features obtained from the learner and expert. We create toy domains based on PyBullet's HalfCheetah and Ant to assess SILEM's benefits for this type of embodiment mismatch. We also provide qualitative and quantitative results on more realistic problems -- teaching simulated humanoid agents, including Atlas from Boston Dynamics, to walk by observing human demonstrations.
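As a rough illustration of the compensation idea described above, the sketch below shows one way a learned affine transform could be applied to the learner's skeletal features before they are compared against the expert's features (e.g. by a GAIL-style discriminator). This is a minimal, hypothetical example, not the authors' reference implementation; the module name, initialization, and the discriminator-based setup it assumes are illustrative assumptions.

\begin{verbatim}
import torch
import torch.nn as nn

class AffineCompensation(nn.Module):
    """Learned affine transform W x + b applied to the learner's
    skeletal features, so they can be matched against expert
    features despite embodiment mismatch.
    (Hypothetical sketch, not the paper's implementation.)"""

    def __init__(self, feature_dim: int):
        super().__init__()
        # Start near the identity so training begins from
        # "no compensation".
        self.weight = nn.Parameter(torch.eye(feature_dim))
        self.bias = nn.Parameter(torch.zeros(feature_dim))

    def forward(self, learner_features: torch.Tensor) -> torch.Tensor:
        return learner_features @ self.weight.T + self.bias


# Usage sketch: align learner features before feeding a
# discriminator that also sees expert skeletal features.
if __name__ == "__main__":
    feature_dim = 12                      # e.g. joint angles / link heights
    compensate = AffineCompensation(feature_dim)
    learner_batch = torch.randn(32, feature_dim)
    aligned = compensate(learner_batch)   # same shape, embodiment-adjusted
    print(aligned.shape)                  # torch.Size([32, 12])
\end{verbatim}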