Most prior methods for learning navigation policies require access to simulation environments, as they need online policy interaction and rely on ground-truth maps for rewards. However, building simulators is expensive (requiring manual effort for every scene) and creates challenges in transferring learned policies to robotic platforms in the real world, due to the sim-to-real domain gap. In this paper, we pose a simple question: do we really need active interaction, ground-truth maps, or even reinforcement learning (RL) to solve the image-goal navigation task? We propose a self-supervised approach that learns to navigate from only passive videos of roaming. Our approach, No RL, No Simulator (NRNS), is simple and scalable, yet highly effective. NRNS outperforms RL-based formulations by a significant margin. We present NRNS as a strong baseline for any future image-based navigation task that uses RL or simulation.