Systems for language-guided human-robot interaction must satisfy two key desiderata for broad adoption: adaptivity and learning efficiency. Unfortunately, existing instruction-following agents cannot adapt, lacking the ability to incorporate online natural language supervision, and even if they could, they require hundreds of demonstrations to learn even simple policies. In this work, we address these problems by presenting Language-Informed Latent Actions with Corrections (LILAC), a framework for incorporating and adapting to natural language corrections - "to the right," or "no, towards the book" - online, during execution. We explore rich manipulation domains within a shared autonomy paradigm. Instead of discrete turn-taking between a human and robot, LILAC splits agency between the human and robot: language is an input to a learned model that produces a meaningful, low-dimensional control space that the human can use to guide the robot. Each real-time correction refines the human's control space, enabling precise, extended behaviors - with the added benefit of requiring only a handful of demonstrations to learn. We evaluate our approach via a user study where users work with a Franka Emika Panda manipulator to complete complex manipulation tasks. Compared to existing learned baselines covering both open-loop instruction following and single-turn shared autonomy, we show that our corrections-aware approach obtains higher task completion rates, and is subjectively preferred by users because of its reliability, precision, and ease of use.
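The control flow described above - a language-conditioned model that exposes a low-dimensional control space to the human, with corrections refining that space online - can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the dimensions, the hash-based `embed_language` stand-in for a language encoder, the linear `LatentActionDecoder`, and the blending-based correction update are all hypothetical placeholders for the learned components.

```python
import numpy as np

# Illustrative dimensions (hypothetical, not from the paper).
STATE_DIM, LANG_DIM, LATENT_DIM, ACTION_DIM = 7, 8, 2, 7

rng = np.random.default_rng(0)

def embed_language(utterance: str) -> np.ndarray:
    """Stand-in for a frozen language encoder; in practice this would be
    a pretrained sentence-embedding model."""
    seed = abs(hash(utterance)) % (2**32)
    return np.random.default_rng(seed).standard_normal(LANG_DIM)

class LatentActionDecoder:
    """Maps (robot state, language embedding, low-dim human input) to a
    full robot action. In LILAC this is a learned network; a fixed
    random linear map stands in here."""
    def __init__(self) -> None:
        in_dim = STATE_DIM + LANG_DIM + LATENT_DIM
        self.W = rng.standard_normal((ACTION_DIM, in_dim)) * 0.1

    def __call__(self, state, lang_emb, z) -> np.ndarray:
        x = np.concatenate([state, lang_emb, z])
        return self.W @ x  # expanded action for the 7-DoF arm

decoder = LatentActionDecoder()
state = np.zeros(STATE_DIM)
lang = embed_language("pick up the mug")

# The human drives a 2-D input z (e.g. a joystick); the decoder expands
# it into a task-appropriate high-dimensional action.
z = np.array([1.0, 0.0])
action = decoder(state, lang, z)

# An online correction ("no, to the right") re-conditions the control
# space; blending embeddings is an illustrative update rule only.
correction = embed_language("no, to the right")
lang = 0.5 * lang + 0.5 * correction
corrected_action = decoder(state, lang, z)
```

The key design point the sketch mirrors is that the human never commands the full action space directly: the same low-dimensional input `z` produces different high-dimensional actions depending on the current language conditioning, so a spoken correction changes what the control space means without requiring new demonstrations.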