Recent years have seen a growing number of applications with a natural language interface, either in the form of chatbots or via personal assistants such as Alexa (Amazon), Google Assistant, Siri (Apple), and Cortana (Microsoft). Using these applications requires a basic dialog between the robot and the human. While this kind of dialog exists today mainly with "static" robots that do not move within the household space, the challenge of reasoning about the information conveyed by the environment increases significantly for robots that can move and manipulate objects in the home environment. In this paper, we focus on cognitive robots, which have a knowledge-based model of the world and operate by reasoning and planning with this model. Thus, when the robot and the human communicate, there is already a formalism they can use: the robot's knowledge representation formalism. Our goal in this research is to translate natural language utterances into the robot's formalism, allowing much more complicated household tasks to be completed. We do so by combining off-the-shelf state-of-the-art (SOTA) language models, planning tools, and the robot's knowledge base for better communication. In addition, we analyze different directive types and illustrate the contribution of the world's context to the translation process.