The co-adaptation of robots has been a long-standing research endeavour with the goal of adapting both the body and the behaviour of a system for a given task, inspired by the natural evolution of animals. Co-adaptation has the potential to eliminate costly manual hardware engineering as well as improve the performance of systems. The standard approach to co-adaptation is to use a reward function for optimizing behaviour and morphology. However, defining and constructing such reward functions is notoriously difficult and often requires significant engineering effort. This paper introduces a new viewpoint on the co-adaptation problem, which we call co-imitation: finding a morphology and a policy that allow an imitator to closely match the behaviour of a demonstrator. To this end, we propose a co-imitation methodology for adapting behaviour and morphology by matching the state distributions of the demonstrator. Specifically, we focus on the challenging scenario in which the state and action spaces of the two agents are mismatched. We find that co-imitation increases behaviour similarity across a variety of tasks and settings, and demonstrate co-imitation by transferring human walking, jogging and kicking skills onto a simulated humanoid.