In this work, we present a memory-augmented approach for image-goal navigation. Our key hypothesis is that, while episodic reinforcement learning may be a convenient framework for tackling this task, embodied agents, once deployed, do not simply cease to exist after an episode has ended. They persist, and so should their memories. Our approach leverages a cross-episode memory to learn to navigate. First, we train a state-embedding network in a self-supervised fashion, and then use it to embed previously visited states into the agent's memory. Our navigation policy takes advantage of the information stored in the memory via an attention mechanism. We validate our approach through extensive evaluations, and show that our model establishes a new state of the art on the challenging Gibson dataset. We obtain this competitive performance from RGB input alone, without access to additional information such as position or depth.
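As a rough illustration only (the paper's actual architecture is not specified here), an attention-based read over a memory of embedded past states can be sketched as a scaled dot-product attention: the current observation's embedding acts as a query, and the policy receives a weighted summary of stored state embeddings. All names and dimensions below are illustrative assumptions.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D score vector.
    e = np.exp(x - x.max())
    return e / e.sum()

def attention_read(memory, query):
    """Scaled dot-product attention over stored state embeddings.

    memory: (N, d) array of N embedded, previously visited states.
    query:  (d,) embedding of the current observation.
    Returns a (d,) weighted summary vector for the policy.
    """
    d = query.shape[-1]
    scores = memory @ query / np.sqrt(d)   # similarity of query to each stored state
    weights = softmax(scores)              # attention weights, sum to 1
    return weights @ memory                # convex combination of memory rows

# Toy usage: 5 stored states of embedding dimension 8 (hypothetical sizes).
rng = np.random.default_rng(0)
memory = rng.standard_normal((5, 8))
query = rng.standard_normal(8)
read_vector = attention_read(memory, query)
```

In a cross-episode setting, the `memory` matrix would persist across episodes and grow (or be pruned) as the agent visits new states, rather than being reset at each episode boundary.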