Contemporary approaches to perception, planning, estimation, and control have allowed robots to operate robustly as our remote surrogates in uncertain, unstructured environments. This progress now creates an opportunity for robots to operate not only in isolation, but also with and alongside humans in our complex environments. Realizing this opportunity requires an efficient and flexible medium through which humans can communicate with collaborative robots. Natural language provides one such medium, and through significant progress in statistical methods for natural-language understanding, robots are now able to interpret a diverse array of free-form commands. However, most contemporary approaches require a detailed, prior spatial-semantic map of the robot's environment that models the space of possible referents of an utterance. Consequently, these methods fail when robots are deployed in new, previously unknown, or partially observed environments, particularly when mental models of the environment differ between the human operator and the robot. This paper provides a comprehensive description of a novel learning framework that allows field and service robots to interpret and correctly execute natural-language instructions in a priori unknown, unstructured environments. Integral to our approach is its use of language as a "sensor": it infers spatial, topological, and semantic information implicit in the utterance and then exploits this information to learn a distribution over a latent environment model. We incorporate this distribution in a probabilistic language-grounding model and infer a distribution over a symbolic representation of the robot's action space. We use imitation learning to identify a belief-space policy that reasons over the environment and behavior distributions. We evaluate our framework through a variety of navigation and mobile-manipulation experiments.