Animals are able to imitate each other's behavior despite differences in their biomechanics. In contrast, having one robot imitate another, similar robot is a much more challenging task in robotics. This problem is called cross-domain imitation learning (CDIL). In this paper, we consider CDIL on a class of similar robots. We tackle this problem by introducing an imitation learning algorithm based on invariant representations. We propose to learn invariant state and action representations, which align the behavior of multiple robots so that CDIL becomes possible. Compared with previous invariant representation learning methods for similar purposes, our method does not require human-labeled pairwise data for training. Instead, we use cycle-consistency and domain confusion to align the representations and increase their robustness. We test the algorithm on multiple robots in simulation and show that unseen robot instances can be trained successfully with existing expert demonstrations. Qualitative results also demonstrate that the proposed method learns similar representations for different robots with similar behaviors, which is essential for successful CDIL.
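The two alignment signals named in the abstract, cycle-consistency and domain confusion, can be illustrated with a minimal sketch. This is not the authors' implementation: the linear encoders and decoders, the dimensions, the logistic domain classifier, and all names (`enc_a`, `dec_b`, `disc_w`, and so on) are hypothetical and chosen only to make the two loss terms concrete.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: two robots with different state dimensions, one shared latent space.
DIM_A, DIM_B, DIM_Z = 6, 8, 4

# Illustrative random linear maps; a real system would use trained networks.
enc_a = rng.normal(size=(DIM_A, DIM_Z))  # robot A state -> shared latent
enc_b = rng.normal(size=(DIM_B, DIM_Z))  # robot B state -> shared latent
dec_a = rng.normal(size=(DIM_Z, DIM_A))  # shared latent -> robot A state
dec_b = rng.normal(size=(DIM_Z, DIM_B))  # shared latent -> robot B state
disc_w = rng.normal(size=(DIM_Z,))       # weights of a logistic domain classifier


def cycle_consistency_loss(states_a):
    """Map A -> latent -> B -> latent -> A and penalize the round-trip error."""
    z = states_a @ enc_a
    states_b = z @ dec_b
    z_back = states_b @ enc_b
    recon_a = z_back @ dec_a
    return float(np.mean((recon_a - states_a) ** 2))


def domain_confusion_loss(z):
    """Cross-entropy between the classifier's prediction and a uniform 0.5 target.

    The value is smallest (log 2) when the classifier cannot tell which robot
    produced the latent code, which is what an invariant representation needs.
    """
    p = 1.0 / (1.0 + np.exp(-(z @ disc_w)))
    p = np.clip(p, 1e-7, 1.0 - 1e-7)  # avoid log(0)
    return float(np.mean(-0.5 * np.log(p) - 0.5 * np.log(1.0 - p)))


states_a = rng.normal(size=(5, DIM_A))
total = cycle_consistency_loss(states_a) + domain_confusion_loss(states_a @ enc_a)
```

In a training loop, both terms would be minimized with respect to the encoder and decoder parameters, while the domain classifier would be trained separately to distinguish the robots. Neither term requires paired samples across robots, which is consistent with the abstract's claim that no human-labeled pairwise data is needed.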