Our goal is to enable a robot to learn how to sequence its actions to perform tasks specified as natural language instructions, given successful demonstrations from a human partner. The ability to plan high-level tasks can be factored as (i) inferring the specific goal predicates that characterize the task implied by a language instruction for a given world state and (ii) synthesizing a feasible goal-reaching action sequence with such predicates. For the former, we leverage a neural network prediction model; for the latter, we use a symbolic planner. We introduce a novel neuro-symbolic model, GoalNet, for contextual and task-dependent inference of goal predicates from human demonstrations and linguistic task descriptions. GoalNet combines (i) learning, where dense representations are acquired for the language instruction and the world state, enabling generalization to novel settings, and (ii) planning, where the cause-effect modeling by the symbolic planner eschews irrelevant predicates, facilitating multi-stage decision making in large domains. GoalNet demonstrates a significant improvement (51%) in the task completion rate compared with a state-of-the-art rule-based approach on a benchmark data set displaying linguistic variations, particularly for multi-stage instructions.
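The two-part factoring described in the abstract, goal-predicate inference followed by symbolic planning, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: `predict_goal` is a hypothetical keyword lookup standing in for the learned neural predictor, the planner is a plain breadth-first search over STRIPS-style actions rather than the planner used with GoalNet, and the predicates and actions form an invented toy domain.

```python
from collections import deque
from typing import FrozenSet, List, NamedTuple, Optional

State = FrozenSet[str]  # a world state is a set of true predicates


class Action(NamedTuple):
    name: str
    preconditions: FrozenSet[str]
    add_effects: FrozenSet[str]
    delete_effects: FrozenSet[str]


def predict_goal(instruction: str, state: State) -> FrozenSet[str]:
    """Hypothetical stand-in for the neural goal predictor (keyword lookup only)."""
    text = instruction.lower()
    if "milk" in text and "fridge" in text:
        return frozenset({"in(milk, fridge)"})
    return frozenset()


def plan(state: State, goal: FrozenSet[str], actions: List[Action]) -> Optional[List[str]]:
    """Breadth-first search for a shortest action sequence that reaches the goal."""
    frontier = deque([(state, [])])
    visited = {state}
    while frontier:
        current, path = frontier.popleft()
        if goal <= current:  # every goal predicate holds
            return path
        for action in actions:
            if action.preconditions <= current:
                successor = (current - action.delete_effects) | action.add_effects
                if successor not in visited:
                    visited.add(successor)
                    frontier.append((successor, path + [action.name]))
    return None  # goal unreachable with the given actions


# Invented toy domain: the milk starts on the table and the fridge is closed.
ACTIONS = [
    Action("open_fridge",
           frozenset({"closed(fridge)"}),
           frozenset({"open(fridge)"}),
           frozenset({"closed(fridge)"})),
    Action("put_milk_in_fridge",
           frozenset({"open(fridge)", "on(milk, table)"}),
           frozenset({"in(milk, fridge)"}),
           frozenset({"on(milk, table)"})),
    Action("close_fridge",
           frozenset({"open(fridge)"}),
           frozenset({"closed(fridge)"}),
           frozenset({"open(fridge)"})),
]

initial_state: State = frozenset({"closed(fridge)", "on(milk, table)"})
goal = predict_goal("Put the milk in the fridge", initial_state)
result = plan(initial_state, goal, ACTIONS)
print(result)
```

In this sketch the instruction never mentions opening the fridge; the planner supplies that step from the action preconditions, which is the kind of multi-stage behavior the abstract attributes to cause-effect modeling in the symbolic planner.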