Creating Multimodal Interactive Agents with Imitation and Self-Supervised Learning

DeepMind Interactive Agents Team, Josh Abramson, Arun Ahuja, Arthur Brussee, Federico Carnevale, Mary Cassin, Felix Fischer, Petko Georgiev, Alex Goldin, Tim Harley, Felix Hill, Peter C Humphreys, Alden Hung, Jessica Landon, Timothy Lillicrap, Hamza Merzic, Alistair Muldal, Adam Santoro, Guy Scully, Tamara von Glehn, Greg Wayne, Nathaniel Wong, Chen Yan, Rui Zhu

A common vision from science fiction is that robots will one day inhabit our physical spaces, sense the world as we do, assist our physical labours, and communicate with us through natural language. Here we study how to design artificial agents that can interact naturally with humans using the simplification of a virtual environment. We show that imitation learning of human-human interactions in a simulated world, in conjunction with self-supervised learning, is sufficient to produce a multimodal interactive agent, which we call MIA, that successfully interacts with non-adversarial humans 75% of the time. We further identify architectural and algorithmic techniques that improve performance, such as hierarchical action selection. Altogether, our results demonstrate that imitation of multi-modal, real-time human behaviour may provide a straightforward and surprisingly effective means of imbuing agents with a rich behavioural prior from which agents might then be fine-tuned for specific purposes, thus laying a foundation for training capable agents for interactive robots or digital assistants. A video of MIA's behaviour may be found at https://youtu.be/ZFgRhviF7mY
