Natural language provides an accessible and expressive interface to specify long-term tasks for robotic agents. However, non-experts are likely to specify such tasks with high-level instructions, which abstract over specific robot actions through several layers of abstraction. We propose that key to bridging this gap between language and robot actions over long execution horizons are persistent representations. We propose a persistent spatial semantic representation method, and show how it enables building an agent that performs hierarchical reasoning to effectively execute long-term tasks. We evaluate our approach on the ALFRED benchmark and achieve state-of-the-art results, despite completely avoiding the commonly used step-by-step instructions.
翻译:自然语言为确定机器人代理人的长期任务提供了无障碍和直观的界面,然而,非专家可能以高层次的指示来具体指定此类任务,这些指示通过若干层抽象地对具体的机器人行动进行抽象的抽象的抽象抽象。我们提议,弥合语言与机器人行动之间在长期执行范围内的这种差距的关键是持续的代表性。我们建议了一种持续的空间语义表达法,并表明它如何能够使从事等级推理的代理人能够有效执行长期任务。我们评估了我们关于ALFRED基准的方法,并取得了最新的结果,尽管完全避免了常用的逐步指示。