Our work focuses on the Multi-Object Navigation (MultiON) task, in which an agent must navigate to multiple objects in a given sequence. We systematically investigate the inherent modularity of this task by decomposing our approach into four modules: (a) an object detection module trained to identify objects from RGB images, (b) a map building module that builds a semantic map of the observed objects, (c) an exploration module that enables the agent to explore its surroundings, and (d) a navigation module that moves the agent to identified target objects. In this work we focus on the navigation and exploration modules. We show that a PointGoal navigation model can be leveraged effectively in the MultiON task instead of learning to navigate from scratch. Our experiments show that a navigation module based on a PointGoal agent outperforms analytical path planning on the MultiON task. We also compare exploration strategies and, surprisingly, find that a random exploration strategy significantly outperforms more advanced exploration methods. Finally, we create MultiON 2.0, a new large-scale dataset that serves as a test bed for our approach.
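The four-module decomposition described above can be illustrated with a minimal sketch. This is a toy under stated assumptions, not the authors' implementation: the `GridWorld` class, `visible_objects`, `step_toward`, and `run_episode` are hypothetical names; detection is an oracle stub rather than a model trained on RGB images; and a hand-written greedy stepper stands in for the learned PointGoal policy. Only the control flow (detect, update the map, then either explore or navigate to each target in order) reflects the modular structure.

```python
import random
from dataclasses import dataclass
from typing import Dict, List, Tuple

Pos = Tuple[int, int]
MOVES: List[Pos] = [(1, 0), (-1, 0), (0, 1), (0, -1)]


@dataclass
class GridWorld:
    """Hypothetical toy environment: an obstacle-free size x size grid."""
    size: int
    objects: Dict[str, Pos]
    agent: Pos = (0, 0)

    def visible_objects(self, radius: int = 1) -> Dict[str, Pos]:
        # (a) Detection stub: an oracle that reports objects within
        # `radius` cells (Chebyshev distance) of the agent.
        ax, ay = self.agent
        return {name: p for name, p in self.objects.items()
                if max(abs(p[0] - ax), abs(p[1] - ay)) <= radius}

    def move(self, dx: int, dy: int) -> None:
        # Clamp the new position so the agent stays on the grid.
        x = min(max(self.agent[0] + dx, 0), self.size - 1)
        y = min(max(self.agent[1] + dy, 0), self.size - 1)
        self.agent = (x, y)


def step_toward(pos: Pos, goal: Pos) -> Pos:
    # (d) Navigation stand-in: greedy step toward a point goal,
    # first along x, then along y. Assumes pos != goal.
    if pos[0] != goal[0]:
        return (1 if goal[0] > pos[0] else -1, 0)
    return (0, 1 if goal[1] > pos[1] else -1)


def run_episode(world: GridWorld, targets: List[str],
                max_steps: int = 10_000, seed: int = 0) -> List[str]:
    """Visit `targets` in order; return the targets reached within budget."""
    rng = random.Random(seed)
    semantic_map: Dict[str, Pos] = {}  # (b) object name -> map position
    found: List[str] = []
    steps = 0
    for target in targets:
        while steps < max_steps:
            semantic_map.update(world.visible_objects())
            if target in semantic_map:
                goal = semantic_map[target]
                if world.agent == goal:
                    found.append(target)
                    break
                dx, dy = step_toward(world.agent, goal)
            else:
                dx, dy = rng.choice(MOVES)  # (c) random exploration
            world.move(dx, dy)
            steps += 1
        else:
            break  # step budget exhausted before reaching this target
    return found
```

For example, `run_episode(GridWorld(size=5, objects={"red": (1, 1), "blue": (2, 2)}), ["red", "blue"])` runs one episode with two ordered targets. Because each module sits behind a small function boundary, swapping the random explorer or the greedy stepper for another strategy leaves the rest of the loop unchanged, which is the kind of module-level comparison the text describes.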