We study the problem of developing autonomous agents that follow human instructions by inferring and performing a sequence of actions to complete the underlying task. Significant progress has been made in recent years, especially on short-horizon tasks. On long-horizon tasks with extended action sequences, however, an agent can easily ignore some instructions or get stuck partway through a long instruction sequence and eventually fail the task. To address this challenge, we propose a model-agnostic milestone-based task tracker (M-TRACK) that guides the agent and monitors its progress. Specifically, we propose a milestone builder that tags the instructions with navigation and interaction milestones, which the agent must complete step by step, and a milestone checker that systematically checks the agent's progress on its current milestone and determines when to proceed to the next. On the challenging ALFRED dataset, M-TRACK yields notable relative improvements of 33% and 52% in unseen success rate over two competitive base models.
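As a rough illustration of the milestone idea described in the abstract, the following minimal sketch tracks an ordered list of navigation and interaction milestones and advances only when the current one is judged complete. It is not the authors' implementation: the names `Milestone` and `MilestoneTracker`, the `update` signature, and the completion rules (a navigation milestone counts as reached when the target object is visible, an interaction milestone when the agent has interacted with the target) are all assumptions made for this example, and the milestones are supplied already tagged rather than built from raw instructions.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Milestone:
    """One tagged step (hypothetical structure, not from the paper)."""
    kind: str    # "navigation" or "interaction"
    target: str  # name of the object the milestone refers to


class MilestoneTracker:
    """Walks through milestones in order, one at a time (illustrative only)."""

    def __init__(self, milestones: List[Milestone]) -> None:
        for m in milestones:
            if m.kind not in ("navigation", "interaction"):
                raise ValueError(f"unknown milestone kind: {m.kind}")
        self.milestones = list(milestones)
        self.index = 0  # position of the milestone currently being pursued

    @property
    def current(self) -> Optional[Milestone]:
        """The milestone in progress, or None once all are complete."""
        if self.index < len(self.milestones):
            return self.milestones[self.index]
        return None

    @property
    def done(self) -> bool:
        return self.index >= len(self.milestones)

    def update(self, visible_objects: Iterable[str],
               interacted_object: Optional[str] = None) -> bool:
        """Check the current milestone; advance and return True if it is met.

        Assumed completion rules (for illustration):
        - navigation: the target appears among the visible objects
        - interaction: the agent just interacted with the target
        """
        m = self.current
        if m is None:
            return False
        if m.kind == "navigation":
            reached = m.target in set(visible_objects)
        else:
            reached = interacted_object == m.target
        if reached:
            self.index += 1
        return reached
```

In this sketch, an agent loop would call `update` after every step and condition its policy on `tracker.current`, so that a later instruction is not attempted before the earlier milestone has been met.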